Design and validation of a pericentromeric BAC clone set aimed at improving diagnosis and phenotype prediction of supernumerary marker chromosomes

Background Small supernumerary marker chromosomes (sSMCs) are additional, structurally abnormal chromosomes, generally smaller than chromosome 20 of the same metaphase spread. Due to their small size, they are difficult to characterize by conventional cytogenetics alone. In regard to their clinical effects, sSMCs are a heterogeneous group: in particular, sSMCs containing pericentromeric euchromatin are likely to be associated with abnormal outcomes, although exceptions have been reported. To improve characterization of the genetic content of sSMCs, several approaches might be applied based on different molecular and molecular-cytogenetic assays, e.g., fluorescent in situ hybridization (FISH), array-based comparative genomic hybridization (array CGH), and multiplex ligation-dependent probe amplification (MLPA). To provide a complementary tool for the characterization of sSMCs, we constructed and validated a new, FISH-based, pericentromeric Bacterial Artificial Chromosome (BAC) clone set that with a high resolution spans the most proximal euchromatic sequences of all human chromosome arms, excluding the acrocentric short arms. Results By FISH analysis, we assayed 561 pericentromeric BAC probes and excluded 75 that showed a wrong chromosomal localization. The remaining 486 probes were used to establish 43 BAC-based pericentromeric panels. Each panel consists of a core, which with a high resolution covers the most proximal euchromatic ~0.7 Mb (on average) of each chromosome arm and generally bridges the heterochromatin/euchromatin junction, as well as clones located proximally and distally to the core. The pericentromeric clone set was subsequently validated by the characterization of 19 sSMCs. Using the core probes, we could rapidly distinguish between heterochromatic (1/19) and euchromatic (11/19) sSMCs, and estimate the euchromatic DNA content, which ranged from approximately 0.13 to more than 10 Mb. 
The characterization was not completed for seven sSMCs due to a lack of information about the covered region in the reference sequence (1/19) or sample insufficiency (6/19). Conclusions Our results demonstrate that this pericentromeric clone set is useful as an alternative tool for sSMC characterization, primarily in cases of very small SMCs that contain either heterochromatin exclusively or a tiny amount of euchromatic sequence, and also in cases of low-level or cryptic mosaicism. The resulting data will foster knowledge of human proximal euchromatic regions involved in chromosomal imbalances, thereby improving genotype–phenotype correlations.

The phenotypic expression of sSMCs ranges from asymptomatic to symptomatic, and depends on several factors including chromosomal origin, satellite vs. nonsatellite inclusion, euchromatic/heterochromatic content, uniparental disomy (UPD) of the chromosomes homologous to the sSMC, and mosaicism [3]. Furthermore, the presence of centromere-proximal euchromatin on an sSMC correlates with abnormal phenotypes, although several exceptions have been described [4]. Since the optimal strategies for genetic counseling and clinical management depend on the characteristics of sSMCs, it is vitally important to precisely characterize sSMCs in order to obtain additional information regarding their phenotypic effects. To this end, several fluorescent in situ hybridization (FISH)-based techniques have been developed over the years [5] for determining the origin of sSMCs and allowing breakpoint characterization, at least in cases of larger euchromatic SMCs. These methods include multicolor FISH (M-FISH) [6], spectral karyotyping (SKY) [7], centromere- and subcentromere-specific M-FISH (cenM-FISH and subcenM-FISH) [3,8,9], multicolor banding [10], and microdissection followed by reverse FISH [11,12]. More recently, a pericentric-ladder-FISH (PCL-FISH) probe set has been developed based on 174 locus-specific BAC probes, and this probe set has been used in dual-color/multicolor-FISH approaches. This tool is specific for the pericentromeric regions and, therefore, enables sSMC breakpoint characterization with a resolution between 1 and ~10 Mb [13].
Furthermore, array-based comparative genomic hybridization (array CGH) analysis has been extensively used in sSMC characterization. This method allows, in a single experiment, determination of the marker chromosomal origin, definition of the size of aberrations (including euchromatic regions), and identification of complex rearrangements or multiple markers in single individuals [14-20]. However, array CGH may fail to identify the origins of very small SMCs in up to 50% of cases because its pericentromeric coverage is limited by the presence of segmental duplications, and it may also be unable to detect low-level and cryptic mosaicism [13,19-21]. Consequently, it is necessary to complement array CGH using FISH approaches [13,22]. In addition, to allow rapid discrimination between sSMCs that are positive or negative for unique sequences, an alternative approach using multiplex ligation-dependent probe amplification (MLPA) analysis has recently been developed for use in the context of prenatal diagnosis [23].
In this study, we report the design and validation of a new pericentromeric Bacterial Artificial Chromosome (BAC) clone set that covers the most proximal euchromatic sequences of all human chromosome arms, as well as the heterochromatin/euchromatin junctions, excluding the short arms of acrocentric chromosomes. This set was designed to improve molecular characterization of sSMCs by FISH analysis, a molecular-cytogenetic technique that, in contrast to array CGH, is available in most cytogenetic laboratories. This new complementary tool will be especially useful in cases of low-level mosaicism and/or very small marker chromosomes, which are likely to consist entirely of heterochromatin or contain only a tiny amount of euchromatic sequence, as demonstrated by some reported sSMC cases.

Molecular-cytogenetic characterization of 19 sSMCs
The utility of the clone set was then validated by the molecular characterization of 19 sSMCs, six of which were ascertained during routine prenatal testing (Additional file 2: Table S2). Nine sSMCs (~47%) were derived from acrocentric chromosomes. Six (~32%) were inherited (patients 6, 8, and 10-13); two of these were detected in siblings (patients 10 and 11) and are complex marker chromosomes originating from a maternal balanced rearrangement. Multiple markers occurred in a single adult patient (~5%) (patient 15). Mosaicism was detected in seven patients (~37%) (patients 1-5, 15, and 18), in most cases involving non-acrocentric chromosomes. Uniparental disomy (UPD) analysis of sSMC sister chromosomes was not performed. Details of sSMC characterization are provided in Additional file 2: Table S2.
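The cohort proportions quoted above follow directly from the reported counts. A minimal sketch that recomputes them is given below; the counts come from the text, whereas the dictionary labels and the helper name `percent_of_cohort` are illustrative and not part of the original study.

```python
# Counts are taken from the paragraph above; the labels are illustrative only.
TOTAL_SSMCS = 19

cohort_counts = {
    "acrocentric origin": 9,
    "inherited": 6,
    "multiple markers in one patient": 1,
    "mosaic": 7,
}


def percent_of_cohort(count: int, total: int = TOTAL_SSMCS) -> float:
    """Return count as a percentage of the cohort, rounded to one decimal."""
    return round(100 * count / total, 1)


cohort_percentages = {
    label: percent_of_cohort(n) for label, n in cohort_counts.items()
}

for label, pct in cohort_percentages.items():
    print(f"{label}: {cohort_counts[label]}/{TOTAL_SSMCS} = {pct}%")
```

The categories overlap (for example, a marker can be both acrocentric and inherited), so the percentages are not expected to sum to 100.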
Using the core panel probes, we were able to rapidly distinguish between heterochromatic (1/19) and pericentromeric euchromatic marker chromosomes (11/19), even in cases of low-level mosaicism. In addition, we either precisely established or estimated the size of the euchromatic content in 17 out of 19 sSMCs (Additional file 2: Table S2, Figures 1, 2 and 3). The euchromatic DNA present on the sSMCs ranged from ~0.13 Mb to more than 10 Mb (Additional file 2: Table S2). However, sSMC characterization was not completed in patient 6, due to the incompleteness of the chromosome 21q core panel physical map in the reference sequence, and in patients 7-11 and 15, due to sample insufficiency (Additional file 2: Table S2).
In two patients, we detected large marker chromosomes in high-level mosaicism, as follows: a ring chromosome 4 [r(4)] in patient 3, and an r(11) in patient 5 (Additional file 2: Table S2, Figure 2). The molecular characterization of these SMCs was performed using the panel probes and subsequently refined by array CGH analysis, which allowed a more precise definition of the breakpoints.

Discussion
Pericentromeric regions of human chromosomes are transitional territories between centromeric heterochromatin and euchromatic regions. They represent complex mosaic structures, including coding sequences interspersed with non-coding sequences [24]. Therefore, sequencing of these regions is technically difficult, and a complementary approach is necessary to clarify their role in human disease.
sSMCs generally contain a centromeric/pericentromeric region, and their precise characterization is a powerful tool for identifying which genomic regions lead to abnormalities when they are affected by dosage imbalances. Over the years, numerous FISH-based approaches have been developed, and these methods have contributed to improvements in sSMC characterization. However, assays such as M-FISH/SKY [6,7], cenM-FISH [8] and subcenM-FISH [3,9] are limited to the identification of the sSMC chromosomal origin or the characterization of larger euchromatic SMCs, and only the use of FISH-banding or locus-specific probes [13,25,26] can improve breakpoint characterization [13,27,28]. In particular, the PCL-FISH probe set, recently developed by Hamid et al. (2012) [13], is a bar-code FISH assay that constitutes a 10 Mb raster along pericentromeric chromosomal regions, allowing the determination of mosaic and non-mosaic sSMC breakpoints within genomic regions of 1-10 Mb in size. In addition, this approach has been particularly useful in characterizing cryptic mosaic sSMCs [29], and for easily defining all involved breakpoints [13]. However, because it covers the most proximal 1 Mb of euchromatic sequences of each chromosome arm with a very low resolution, PCL-FISH is not useful for the characterization of very small SMCs that contain only tiny amounts of euchromatic sequence [13]. Likewise, a more sensitive technique such as array CGH, which can significantly narrow down sSMC breakpoints [14-20,30], can still yield incomplete pericentromeric coverage due to the presence of large duplicated sequences. Moreover, array CGH cannot detect low-level and/or cryptic mosaic sSMCs. Therefore, in order to provide a complementary tool for sSMC characterization, we established a FISH-based pericentromeric BAC clone set, including probes that cover the most proximal euchromatic ~0.7 Mb (on average) of each chromosome arm at a high resolution. The pericentromeric probe set also includes probes that bridge the heterochromatin/euchromatin junctions (Table 1 and Additional file 1: Table S1), enabling rapid discrimination between euchromatic (11/19 in the present series) and heterochromatic (1/19) sSMCs, irrespective of marker chromosome origin (Additional file 2: Table S2). The most proximal probes were chosen independently of the presence of segmental duplications, with the exceptions of chromosomes 1q and 9p, where pericentromeric paralogous segmental duplications have been detected. As expected, a significant percentage of the assayed probes (~13%) were mislocalized, supporting the need for the large screening effort we performed to verify the predicted physical position of each clone.
Table 1 footnotes:
a The location of the heterochromatic region is an approximation taken from the February 2009 release of the UCSC Human Genome Browser Database, which indicates the distance from the most distal available p-arm sequence to the centromere start and end for that chromosome. For chromosomes 1q, 9q, and 16q, the heterochromatic region ending position indicates the telomeric border of the heterochromatic bands 1q12, 9q12, and 16q11.2, respectively.
b The distance between the heterochromatic region and the most proximal available euchromatic BAC clone indicates the starting position of the pericentromeric coverage of the BAC core panel for each chromosome arm.
c The distance between the proximal and distal euchromatic core BAC clones indicates the pericentromeric euchromatic coverage of the core panel for each chromosome arm; if the proximal BAC clone overlaps with the heterochromatin/euchromatin bridge, the distance refers only to the euchromatic sequence.
d The core panels of chromosomes 1p, 10q, 11p, 20q, and 21q were not completed because of the lack of heterochromatin/euchromatin-bridging physical maps in the reference sequence. The core panels of chromosomes 1q and 9p are incomplete due to the presence of pericentromeric paralogous segmental duplications; therefore, the corresponding core panel probes were excluded from the clone set because their hybridization signals do not give unique mapping information. The partial core panel of chromosome 10q was established using non-contiguous clones.
e The core panel of chromosome Yp was not established because of the lack of a pericentromeric physical map in the reference sequence.
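The screening figures reported in this study (561 probes assayed, 75 excluded, 43 panels, and the 214 core clones listed in Additional file 1) can be cross-checked with a few lines of arithmetic. The sketch below uses only those reported counts; the variable names are illustrative.

```python
# Reported counts from the study; variable names are illustrative.
assayed = 561        # pericentromeric BAC probes tested by FISH
excluded = 75        # probes showing a wrong chromosomal localization
panels = 43          # BAC-based pericentromeric panels
core_clones = 214    # clones belonging to the high-resolution core panels

retained = assayed - excluded                  # probes kept in the clone set
mislocalized_pct = 100 * excluded / assayed    # share of probes excluded
core_pct = 100 * core_clones / retained        # share of retained probes in cores
probes_per_panel = retained / panels           # mean panel size

print(f"retained probes: {retained}")
print(f"mislocalized: {mislocalized_pct:.1f}%")
print(f"core clones: {core_pct:.1f}% of retained probes")
print(f"mean probes per panel: {probes_per_panel:.1f}")
```

The exclusion rate of about 13.4% matches the ~13% quoted in the text, and the retained total matches the 486 probes of the final clone set.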
As previously reported for PCL-FISH [13], we confirmed the utility of our new BAC-probe set in characterizing low-level mosaic sSMCs (Additional file 2: Table S2), suggesting that this approach could be applied to breakpoint identification in cases of cryptic mosaic sSMCs; however, no pertinent cases are present in the series reported here.
In terms of genotype-phenotype correlation, the data we collected regarding sSMC characterization allowed us to confirm the existence of pericentromeric euchromatic critical and noncritical regions surrounding the centromeres of all chromosome arms, i.e., regions in which trisomy or tetrasomy either does or does not correlate with pathological phenotypes [4]. Accordingly, no clinical signs were reported in association with acrocentric sSMCs that have breakpoints localized in predicted pericentromeric noncritical regions [4,14,16,22,31] (patients 6-9, 12, and 13), with the exception of patient 7, who exhibited a growth delay likely not associated with the sSMC (Additional file 2: Table S2). Furthermore, in siblings 10 and 11, the 14q breakpoint characterization was useful in linking the reported clinical findings specifically to trisomy of 6pter-p25 (Additional file 2: Table S2). By contrast, we classified the larger idic(22;22) marker chromosomes (patients 16-18) as type I Cat Eye Syndrome (CES) chromosomes, which result in the CES phenotype [32], as confirmed in patient 17 (Additional file 2: Table S2). In addition, in patient 18, the molecular characterization of idic(22;22) revealed asymmetrical breakpoints that resulted in both a tetrasomy of ~750 kb, which included the gap reported between the end of the noncritical region and the start of the critical region [4], and a trisomy of ~150-400 kb of euchromatic sequences that are included within the 22q predicted critical region [4] (Additional file 2: Table S2, Figure 3E,F). These observations suggest that the mild phenotype of patient 18, relative to the classical CES clinical presentation, can be at least partially attributed to trisomy rather than tetrasomy of the same euchromatic region, combined with low-level mosaicism of sSMC(22) (Additional file 2: Table S2).
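The dosage reasoning for an isodicentric marker with asymmetrical breakpoints can be made explicit. A cell carrying the marker has two normal homologues plus the marker: a segment present on both halves of the idic is in four copies (tetrasomy), whereas a segment lying between the two breakpoints is present on one half only and is in three copies (trisomy). In mosaicism, the mean copy number across cells is diluted by the fraction of cells without the marker. The sketch below is an illustration under these assumptions; the function names and the mosaic fraction of 0.25 are hypothetical and are not values from the study.

```python
def copies_in_marker_cell(copies_on_marker: int, normal_homologues: int = 2) -> int:
    """Copy number of a segment in a cell that carries the marker."""
    return normal_homologues + copies_on_marker


def mean_copy_number(copies_on_marker: int, mosaic_fraction: float,
                     normal_homologues: int = 2) -> float:
    """Mean copy number across all cells when only a fraction carry the marker."""
    if not 0.0 <= mosaic_fraction <= 1.0:
        raise ValueError("mosaic_fraction must lie between 0 and 1")
    return normal_homologues + mosaic_fraction * copies_on_marker


# Segment proximal to both breakpoints: present on both halves of the idic.
tetrasomic = copies_in_marker_cell(copies_on_marker=2)   # 4 copies
# Segment between the two asymmetrical breakpoints: present on one half only.
trisomic = copies_in_marker_cell(copies_on_marker=1)     # 3 copies

# Hypothetical low-level mosaicism: marker present in 25% of cells.
hypothetical_fraction = 0.25
mean_proximal = mean_copy_number(2, hypothetical_fraction)   # 2.5
mean_between = mean_copy_number(1, hypothetical_fraction)    # 2.25

print(tetrasomic, trisomic, mean_proximal, mean_between)
```

Under this simple model, both the single-copy contribution of the asymmetrical segment and a low mosaic fraction reduce the effective dosage, which is consistent with the milder presentation discussed for patient 18.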
Among the non-acrocentric marker chromosomes collected in our series, the SMC characterization revealed breakpoints within pericentromeric noncritical regions in two cases [4,14-16,20,22,33,34], consistent with the observation that the corresponding patients (3 and 15) had normal phenotypes (Figures 2A-C and 3A-D, Additional file 2: Table S2). Notably, the association of sSMC(2) and sSMC(18) observed in patient 15 has not been previously reported. By performing FISH analysis, we identified pericentromeric segmental duplications shared by chromosomes 2 and 18 (Figure 3A-D), and hypothesized that a complex genomic rearrangement had occurred between those two chromosomes, resulting in sSMC formation and in the simultaneous deletion of the sequence covered by probe RP11-134N21 from sSMC(2) (Figure 3D, Additional file 2: Table S2).
In regard to pathological non-acrocentric sSMCs, our molecular characterization helped us to either refine or confirm the boundaries of the predicted critical regions and to improve genotype-phenotype correlations. For example, characterization of patient 2's mosaic sSRC(1) indicated possible trisomy of the 1p12-q21.1 chromosomal region, involving at most ~1.91 Mb of euchromatic sequences at 1q21.1 (Additional file 2: Table S2). Although the boundaries between the critical and noncritical regions on 1q are not yet available [4], we propose that trisomy of 1q21.1 might be responsible for patient 2's phenotype (Additional file 2: Table S2), as suggested by previous data [3,9,35-38]. However, neurological abnormalities like those exhibited by our patient have been previously reported in patients carrying mosaic sSMC(1), resulting in trisomy of the 1p12-q12 region. In those cases, the 1q breakpoints were mapped within the 1q12 heterochromatic region, suggesting that the 1p trisomy was responsible for the pathological phenotype [3,8,9,39].
Both breakpoints of patient 5's SRC(11) were inferred to have occurred in the pericentromeric critical regions, leading to mosaic trisomy of the 11p11.12-q13.1 region (Additional file 2: Table S2, Figure 2E,F). To date, only one patient (11-W-p11.12/3-1 in the sSMC database [4]) has been reported to have a mosaic sSRC(11) characterized by breakpoints mapped within the same chromosomal bands found in our patient [4]. The reported pathological clinical presentation, characterized by dysmorphism and severe developmental delay, resembles that observed in patient 5 (Additional file 2: Table S2), confirming the pathogenetic role of mosaic trisomy of 11p11.12-q13.1.
Finally, sSMC(10) of patient 4 and sSRC(16) of patient 14 were both ascertained during prenatal testing. The sSRC(16) exclusively involved centromeric heterochromatin (Additional file 2: Table S2, Figure 1C,D), leading the parents to continue the pregnancy, which ended with the birth of a normal baby. By contrast, the sSMC(10) resulted in mosaic trisomy of the 10p11.23-q11.21 region (Additional file 2: Table S2, Figure 1A,B). We considered that a pathological phenotype might arise due to trisomy of at least 3.6 Mb of 10p proximal euchromatic sequences because the 10q breakpoint mapped in the predicted noncritical region [4,17,40]. This hypothesis was subsequently confirmed by the patient's neonatal phenotype (Additional file 2: Table S2).

Conclusions
To summarize, our data demonstrate the potential value of our pericentromeric clone set for characterization of sSMCs in both prenatal and postnatal diagnostics. Because the established resource covers all available pericentromeric regions, it may be particularly useful in cases of very small marker chromosomes, allowing rapid discrimination between heterochromatic and euchromatic sSMCs, as well as precise sizing of imbalances. We also demonstrated the complementarity of FISH analysis using the pericentromeric clone set with array CGH analysis in the characterization of large marker chromosomes.
Apart from ad hoc combinations of different methods, performed when requested, FISH analysis is currently the only available technique for analyzing sSMCs in low or cryptic mosaicism. However, to use this probe set in prenatal cases, cytogenetic labs need to have sufficient resources to store the BAC clones and prepare the probes within a short time. Therefore, although application of this tool in prenatal diagnosis would be beneficial, we strongly recommend that it be used in research aimed at increasing our knowledge of the imbalances of human proximal euchromatic regions, thereby improving genotype-phenotype correlations and the assessment of the genetic risks of supernumerary marker chromosomes.

sSMC samples
Chromosomal samples from peripheral blood lymphocytes and amniotic fluid were collected from five cytogenetic labs. All samples were previously karyotyped by conventional cytogenetics (Q- or G-banding), and the sSMC origin was previously identified by FISH analysis using commercial centromere-specific probes (Vysis, Maidenhead, UK), following the manufacturer's instructions. In both cases, at least 16 metaphases were analyzed.
Germany) equipped with Leica filters specific for DAPI, FITC, Cy3, and DEAC. Images were acquired using a charge-coupled device (CCD) camera (Leica) with a magnification factor of 100×. Image analysis was performed using the Leica CW4000-FISH software (version Y1.3.1). In the initial step of sSMC characterization, the most proximal available core probe(s) for the involved chromosome arm(s) was used to discriminate between euchromatic and heterochromatic sSMCs. Next, fine breakpoint mapping was performed using clones within and/or distal to the core panels, depending on sSMC size. A single-color, dual-color, or three-color hybridization approach was chosen depending on the available amount of sSMC sample and the time available to complete the analysis. In cases of non-mosaic sSMCs, at least 16 metaphases were analyzed, whereas in cases of mosaic sSMCs, the number of analyzed metaphases decreased proportionally with the level of mosaicism. The FISH protocols of Lichter et al. [41] and Lichter and Cremer [42] were followed, with minor modifications.
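The two-step strategy described above (most proximal core probe first, then mapping with progressively more distal clones) amounts to bracketing the breakpoint between the last probe that gives a signal on the marker and the first one that does not. The sketch below illustrates that logic for a single chromosome arm. It is a simplified model, not the authors' software: the probe names, coordinates, and the `ProbeResult` and `bracket_euchromatin` identifiers are hypothetical, and it assumes that positive probes form one contiguous run starting at the centromere.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass(frozen=True)
class ProbeResult:
    name: str        # hypothetical clone label, not a real BAC identifier
    start_mb: float  # proximal end, in Mb from the heterochromatin/euchromatin border
    end_mb: float    # distal end, same coordinate system
    on_marker: bool  # True if a FISH signal was scored on the sSMC


def bracket_euchromatin(
    probes: List[ProbeResult],
) -> Tuple[str, float, Optional[float]]:
    """Classify one arm and return (class, lower bound, upper bound) in Mb."""
    if not probes:
        raise ValueError("at least one probe result is required")
    ordered = sorted(probes, key=lambda p: p.start_mb)  # proximal -> distal

    # Step 1: the most proximal probe decides euchromatic vs heterochromatic.
    if not ordered[0].on_marker:
        return ("heterochromatic", 0.0, ordered[0].start_mb)

    # Step 2: walk distally until the first probe that gives no signal.
    last_positive = ordered[0]
    for probe in ordered[1:]:
        if not probe.on_marker:
            return ("euchromatic", last_positive.start_mb, probe.start_mb)
        last_positive = probe

    # Every tested probe was positive: only a lower bound is available.
    return ("euchromatic", last_positive.start_mb, None)


# Hypothetical results for one arm (names and coordinates are made up).
example = [
    ProbeResult("core-1", 0.00, 0.18, True),
    ProbeResult("core-2", 0.15, 0.33, True),
    ProbeResult("core-3", 0.30, 0.48, False),
    ProbeResult("distal-1", 1.20, 1.38, False),
]
result = bracket_euchromatin(example)
print(result)
```

An upper bound of `None` corresponds to the "more than" estimates reported for the largest markers, where no tested probe was negative. Partial signals, mosaic cell counts, and rearranged or discontinuous markers are deliberately left out of this model.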

Array CGH analysis
Array CGH analysis was performed using the Human Genome CGH Microarray Kit 4 × 44K (Agilent Technologies, Palo Alto, CA), which consists of 42,494 60-mer oligonucleotide probes covering the entire genome with an average spatial resolution of ~43 kb. From both test and sex-matched reference (Promega, Southampton, UK) samples, 3 μg of genomic DNA, previously extracted from probands' whole blood using the GenElute™ Blood Genomic DNA kit (Sigma-Aldrich), was processed according to the manufacturer's protocol. Images were obtained using the Agilent Feature Extraction software (version 9.1), and chromosomal profiles were obtained using the ADM-2 algorithm provided by DNA Analytics software (v4.0) (Agilent Technologies).

Consent
Written informed consent to the research investigation, which was approved by the Ethical Clinical Research Committee of Istituto Auxologico Italiano, was obtained from either the adult patients or one of the parents in case of child patients.

Additional files
Additional file 1: Table S1. Detailed list of the collected 486 pericentromeric BAC probes, including the 214 clones which belong to the high-resolution core panels, according to the UCSC Genome Browser Database assembly hg19.
Additional file 2: Table S2. Summary of 19 sSMCs characterized using the pericentromeric BAC clone set, including clinical data of the carrying patients.

Competing interests
The authors declare that they have no competing interests.
Authors' contributions
CC: study design, acquisition, analysis and interpretation of data from FISH and array-CGH analyses; manuscript preparation; VE: acquisition, analysis and interpretation of data from FISH analysis, review and approval