Genotype-phenotype correlation in 75 patients with small supernumerary marker chromosomes

Background: Small supernumerary marker chromosomes (sSMCs) are rare structural abnormalities in the general population; however, they are frequently found in children or fetuses with hypoevolutism and in infertile adults. sSMCs are usually first observed by karyotyping, and further analysis of their molecular origin is important in clinical practice. Next-generation sequencing (NGS) combined with Sanger sequencing helps to identify the chromosomal origins of sSMCs and to correlate certain sSMCs with a specific clinical picture.
Results: Karyotyping identified 75 sSMCs in 74,266 samples (0.1% incidence). The chromosomal origins of 27 of these sSMCs were determined by sequencing-related techniques (NGS, MLPA, and STR). Eight of these sSMCs are reported here for the first time. The sSMCs were mainly derived from chromosomes X, Y, 15, and 18, and some sSMC chromosomal origins could be correlated with clinical phenotypes. However, the chromosomal origins of the remaining 48 sSMC cases are unknown. Thus, we will develop a set of economical and efficient methods for clinical sSMC diagnosis.
Conclusions: This study details the comprehensive characterization of 27 sSMCs. Eight of these sSMCs are reported here for the first time, providing additional information for sSMC research. Identifying sSMCs may reveal genotype-phenotype correlations and help integrate genomic data into clinical care.


Background
Small supernumerary marker chromosomes (sSMCs) are structural abnormalities whose origins cannot be characterized by conventional cytogenetics alone and instead require molecular approaches. About 70% of sSMCs are de novo, 20% are inherited from the mother, and 10% are inherited from the father [1]. sSMCs often arise from maternal meiosis I/II errors, trisomic/monosomic rescue, or fertilization errors [2,3]. They are equal to or smaller than chromosome 20 in size and often have abnormal morphology (e.g., inverted duplication, minute, or ring). Many of them are derived from the short arms or pericentromeric regions of chromosomes. Nearly 70% of sSMC carriers are clinically normal, whereas about 30% show clinical abnormalities. Depending on the origin of the sSMC, carriers may have developmental delays, intellectual disabilities, mixed gonadal dysgenesis (MGS), or infertility. Before molecular characterization of sSMCs became available, the treatment of these patients was based on their individual symptoms.
In this study, karyotyping identified 75 sSMC cases among 74,266 patients seen in our department from 2015 to 2018. Fifty-seven of the cases were subjected to molecular analysis, and the remaining 18 were not characterized further. sSMCs were first detected by conventional cytogenetic banding analysis, which is poorly suited to identifying their molecular components. Next-generation sequencing (NGS) is a fast, high-throughput sequencing technique used to determine copy number variations [4]. We combined NGS, multiplex ligation-dependent probe amplification (MLPA), and short tandem repeat (STR) analysis to identify the origins of the sSMCs in our study. The molecular components of 27 of the sSMCs were identified. The other 30 sSMCs subjected to molecular analysis yielded no pathogenic information about their chromosome of origin. This study aimed to identify the origins of sSMCs diagnosed in our department over the last 4 years. This approach may help to recognize the syndromes from which sSMC patients suffer, establish suitable and specific therapy, or even predict syndromes that will develop in the future. Such an approach will be of great value in clinical genetic diagnosis and genetic counseling.

Distribution of cases
A total of 74,266 samples from the infertility, pediatrics, and obstetrics departments of Shengjing Hospital were analyzed for genetic diagnosis (Fig. 1). Among them, we studied 75 sSMC carriers (0.1% of all samples), including 23 adults with infertility or habitual abortion (23/75, 30.67%), 20 children with severe developmental delay, MGS, or gynandromorphism (20/75, 26.67%), 23 fetuses with intrauterine growth retardation or abnormal ultrasonic structures (23/75, 30.67%), and nine sSMC cases without a syndrome (9/75, 12%). We performed NGS, MLPA, and STR analysis on 57 sSMCs and identified the chromosomal origins for 27 of these cases (Table 1). The chromosomal origins of the remaining 48 cases are still unknown (Table 2). These data suggest that most sSMC carriers in this cohort had clinical findings, which might be correlated with the origins of their sSMCs.
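The proportions quoted above follow directly from the reported counts. The short Python sketch below reproduces the arithmetic using only figures stated in this section; the variable names and group labels are illustrative and are not part of the study's workflow.

```python
# Counts as reported in this section; names and labels are illustrative only.
total_samples = 74_266
groups = {
    "adults (infertility or habitual abortion)": 23,
    "children (developmental delay, MGS, gynandromorphism)": 20,
    "fetuses (growth retardation or ultrasound anomalies)": 23,
    "no syndrome reported": 9,
}

total_ssmc = sum(groups.values())
incidence_pct = 100 * total_ssmc / total_samples

# Share of each clinical group among all sSMC carriers, in percent.
shares = {name: round(100 * n / total_ssmc, 2) for name, n in groups.items()}

# Molecular work-up: 57 cases tested, 27 with an identified origin.
tested, identified = 57, 27
unknown_origin = total_ssmc - identified  # includes the 18 untested cases

print(f"sSMC carriers: {total_ssmc} ({incidence_pct:.2f}% of samples)")
for name, pct in shares.items():
    print(f"  {name}: {pct}%")
print(f"origin identified: {identified}/{tested} tested; unknown: {unknown_origin}")
```

Note that the 48 cases of unknown origin comprise the 30 tested cases without a result and the 18 cases that were not tested.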
sSMCs from chromosome Y
Twelve sSMCs were derived from chromosome Y. Patients 61166 and W02938 were boys with abnormal sexual development who showed characteristics similar to Turner syndrome with androgyny. The results showed that their sSMCs were derived from a minute Y chromosome with SRY (Fig. 2A, B). Patient 69433 was raised as a girl. MLPA analysis indicated that the sSMC was derived from min(Y) (Fig. 2C). Patients 61680, 62091, 77297, 80794, 98139, and W01824 were adult men with azoospermia and infertility. STR analysis showed that their sSMCs came from min(Y) (Fig. 2D-I, Table 3). Samples 150677, 162047, and 171276 were from amniotic fluid. STR analysis demonstrated that these sSMCs were from min(Y) (Fig. 2J-L).

sSMCs from chromosome 15
The sSMCs of six patients were derived from chromosome 15. NGS identified duplications on chromosome 15 in patients 69813 and W03987 (Fig. 3A, B). MLPA revealed that patients 70532, 83411, and 96862 had a heterozygous duplication at 15q11.2 (Fig. 3D-F). These five patients carried sSMCs derived from inv dup(15). The sSMC of patient W04210 was from min(15) (Fig. 3C). Five of these cases showed clinical features of Dup15q syndrome (e.g., hypoevolutism or autism).

sSMCs from chromosome X
The sSMCs of two patients were derived from chromosome X. These patients showed characteristics of Turner syndrome. NGS indicated that the sSMC of patient 92568, which was mosaic (45,X/46,X,+mar), might be from r(X) (Fig. 4A). The sSMC of patient W09834 was partial 45,X and composed of min(X) (Fig. 4B).

sSMCs from other chromosomes
NGS showed that patient 96932 had a complex sSMC that might be derived from min(X) and min(Y) (Fig. 6A). This patient displayed characteristics similar to Turner syndrome. The sSMC of fetus 172990 was derived from min(9) (Fig. 6B). The sSMC of patient 70963, who showed compound features of partial trisomy 20p and 20q11.22 duplication syndrome with pygmyism and asitia, was derived from min(20) (Fig. 6C). The sSMC of fetus 160246 was derived from min(11) (Fig. 7A-a, b). When the mother became pregnant again, the fetus carried the same balanced translocation (Fig. 7A-c). The sSMC of fetus 184290 was derived from inv dup(22) (Fig. 7B).

sSMCs of unknown chromosomal origin
Although several techniques were used to identify the origins of the different sSMCs, the sSMCs of 48 patients could not be characterized (Table 2). Amniotic fluid samples containing sSMCs were submitted for STR analysis, and the origins of only seven sSMCs were identified. Based on karyotyping, the unidentified sSMCs were classified into three groups (Fig. 8).
Group I sSMCs consisted of inverted duplicated chromosomes. Those in group II were likely minute chromosomes, while those in group III looked like ring chromosomes.

Discussion
In this study, we identified the origins of 27 sSMCs, eight of which are reported here for the first time (Table 1). Of the 27 sSMCs with defined origins, 12 were derived from the Y chromosome and two from the X chromosome. The infertile patients showed azoospermia, and their sSMCs were found to originate from the Y chromosome. Azoospermia factor (AZF), which is located on the long arm of Y (Yq11.23), regulates spermatogenesis [7]. These patients had deletions of the AZF-a region (Sertoli cell-only syndrome), the AZF-b region (sperm-maturation-arrest syndrome), or all AZF regions, resulting in azoospermia.
Thus, artificial insemination with donor sperm or adoption was suggested for the clinical management of these patients.
The pediatric patients carrying sSMCs from min(Y) or chromosome X, or complex sSMCs from min(X) and min(Y), had characteristics similar to Turner syndrome; however, their phenotypes differed depending on the origins of their sSMCs. The short arm of X harbors the short stature-homeobox gene (SHOX on Xp22.33) and the lymphogenic gene forkhead box P3 (FOXP3 on Xp11.23), which are associated with stature and with immunodeficiency or polyendocrinopathy [8]. Patient W09834 with min(X) had a loss of FOXP3 and an immunological problem. A similar sSMC derived from r(X)(::p11.21→q13.1::) was reported in craniofrontonasal syndrome (CFNS) [9]. The long arm of X harbors the methyl-CpG binding protein-2 gene (MECP2 on Xq28), which is associated with Rett syndrome, and the premature ovarian failure loci (POF1: Xq21→qter; POF2: Xq13.3→Xq21.1) [10]. As the min(X) from patient W09834 (:p11.2→q13.2:) and the r(X) from patient 92568 (::p11.23→q21.1::) did not contain SHOX or MECP2, both patients had growth retardation and a high risk of Rett syndrome. Because both sSMCs contained part of the POF regions, attention should be paid to ovarian function. Patient 96932 had a complex sSMC from min(X) and min(Y), resulting in a high risk of type II germ cell tumors [11,12]. Individualized treatment according to genotype-related phenotype was recommended for all the pediatric patients.
Our sSMC patients with the 47,XN,+mar karyotype typically had a specific duplication syndrome, and six sSMCs were identified as being from inv dup(15). The region 15(q11.2→q13.3) is a known breakpoint hotspot. This region harbors the GABAAR genes, the paternally expressed gene SNRPN, and the maternally expressed gene UBE3A, which regulate central nervous system development and function [13]. It is rare for two neocentric sSMCs derived from inv dup(18) to have the same duplicated fragment. There may be a breakpoint hotspot located at 18(p11.21).
In region 18p, approximately 67 genes can contribute to the phenotypes, including AFG3L2, MC2R, and TGIF1, which are associated with developmental disorders [5,6]. Therefore, the care of patient 61259 should focus on artificial feeding, avoiding infections, and evaluating the affected organs and systems. The region 20(p12.3→q11.22) comprises more than 200 genes. Duplication of JAG1, BTBD3 and FLRT3, or ASXL1 can induce Alagille syndrome, neurological dysfunction, or abnormal chromatin remodeling [14,15]. Patient 70963, with the genotype min(20)(:p12.3→q11.22:), showed moderate symptoms because of 60% mosaicism.
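The reasoning used for the X-derived sSMCs in this section, namely checking whether a gene locus lies inside the segment carried by an sSMC, can be illustrated with a small Python sketch. It is not the method used in the study. It assumes a deliberately crude ordering in which band numbers are read as decimals that increase away from the centromere (p arm negative, q arm positive), which ignores sub-band nesting and true genomic coordinates. The function names are hypothetical; the segments and loci are those quoted in the text.

```python
def band_position(band: str) -> float:
    """Map a band such as 'p22.33' or 'q28' to a signed number.

    Crude assumption: p-arm bands are negative, q-arm bands positive, and
    larger magnitudes lie farther from the centromere.
    """
    arm, number = band[0], float(band[1:])
    if arm not in ("p", "q"):
        raise ValueError(f"unexpected arm in band {band!r}")
    return -number if arm == "p" else number


def classify(segment, locus):
    """Return 'inside', 'outside', or 'boundary' for a locus and a segment."""
    lo, hi = sorted(band_position(b) for b in segment)
    pos = band_position(locus)
    if pos in (lo, hi):
        # A locus in the breakpoint band cannot be resolved at band level.
        return "boundary"
    return "inside" if lo < pos < hi else "outside"


# Segments and gene loci as quoted in the text.
SEGMENTS = {
    "W09834 min(X)": ("p11.2", "q13.2"),
    "92568 r(X)": ("p11.23", "q21.1"),
}
GENES = {"SHOX": "p22.33", "FOXP3": "p11.23", "MECP2": "q28"}

for label, segment in SEGMENTS.items():
    for gene, locus in GENES.items():
        print(f"{label}: {gene} ({locus}) -> {classify(segment, locus)}")
```

Under these assumptions, SHOX and MECP2 fall outside both segments, and FOXP3 falls outside the min(X) of patient W09834, in line with the description above. A locus that coincides with a breakpoint band is reported as "boundary" because band-level notation cannot settle it.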
The identification of sSMCs is vital in prenatal diagnosis. Of the 75 sSMC cases in this study, 23 were from fetuses with intrauterine growth retardation or abnormal ultrasonic structures, and seven fetal sSMCs were found to originate from chromosome Y, 18, 9, 11, or 22. However, for most fetal sSMCs the chromosome of origin could not be defined. The three fetal sSMCs from the Y chromosome needed careful evaluation. If the sSMC correlated with androgyny or AZF deletion, it was considered better to continue the pregnancy. However, if a fetus had an inv dup(18) genotype, termination of the pregnancy was suggested because of the i(18p) syndromes. Fetus 172990 had a duplicated region 9(p24.3→p13.1) that correlated with 9p duplication syndrome; this region has been linked both to a potential autism spectrum disorder (ASD) locus and to individuals with normal IQ [16,17]. The sSMC of fetus 160246 was de novo and arose from a maternal balanced translocation t(11;22)(q23;q12), leading to three copies of 11(q23.3→q25). The sSMC derived from the inv dup(22) chromosome was also de novo. The fetus carrying this sSMC had regions similar to those of the 22q11.2 duplication syndrome (22DupS), which usually produces birth defects such as congenital heart disease, hearing loss, hypophrenia, or a high risk of psychosis (including autism) [18,19]. A similar sSMC arising from inv dup(22)(q11.1~11.2) was reported with mild clinical signs [20].
Most sSMCs in fetuses are de novo, but a few are inherited from the parents. Thus, prenatal diagnosis and genetic counseling are critical. In our department, parents are asked to fill out a form to collect genetic information. Amniotic fluid is then submitted for both karyotyping and STR analysis. If an sSMC is diagnosed, further testing (e.g., NGS) is suggested, and the karyotypes of the parents are requested. If the parents are sSMC or translocation carriers, the fetus should undergo further testing. Preimplantation genetic screening (PGS)
Although several sequencing-related techniques were used in our study, there were still 30 sSMCs for which pathogenic information could not be generated. It is possible that the sequencing primers used in the MLPA or STR (AZF) methods did not cover the sSMC regions. In addition, inverted duplicated chromosomes (acrocentric chromosomes), isochromosomes, or minute chromosomes (regions near the centromere) might not have been detected by NGS because of the highly repetitive sequences in the centromere regions. Detection may be improved by increasing read depth and by including read-pair, split-read, or assembly-based analysis of the NGS data. Thus, a set of efficient techniques should be developed for further sSMC identification.
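The prenatal work-up described in this section can be summarized as a small decision function. The Python sketch below only restates the steps given in the text. The class, field, and function names and the step wording are hypothetical and do not represent software used in our department.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class PrenatalCase:
    ssmc_on_karyotype: bool          # sSMC seen on amniotic fluid karyotyping
    parent_is_carrier: bool = False  # a parent carries an sSMC or a translocation


def next_steps(case: PrenatalCase) -> List[str]:
    """List the follow-up steps described in the text (illustrative only)."""
    steps = [
        "collect genetic information from the parents (form)",
        "karyotyping and STR analysis of amniotic fluid",
    ]
    if case.ssmc_on_karyotype:
        steps.append("suggest further testing (e.g., NGS)")
        steps.append("request karyotypes of the parents")
    if case.parent_is_carrier:
        steps.append("further testing of the fetus (parent is a carrier)")
    return steps


# Example: an sSMC is found and one parent carries a balanced translocation.
for step in next_steps(PrenatalCase(ssmc_on_karyotype=True, parent_is_carrier=True)):
    print("-", step)
```

Keeping the steps as an ordered list mirrors the sequence in the text: family information first, then karyotyping with STR, and only afterwards the optional molecular and parental studies.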

Conclusions
In summary, the sSMCs of the study patients were different in origin, size, replication times, affected genes, and mosaicism levels. Thus, their clinical manifestations varied. This study detailed the comprehensive characterization of 27 sSMCs. Eight of these sSMCs are being reported here for the first time, which provides additional information for sSMC research. The identification of sSMCs could reveal genotype-phenotype correlations and integrate genomic data into clinical care.

Patients' collection
This research investigated 74,266 patient specimens received in our department from 2015 to 2018, including 50,794 peripheral blood samples from adults, 6,350 peripheral blood samples from pediatric patients, 14,759 amniotic fluid samples, and 2,363 cord blood samples. Seventy-five sSMC carriers were diagnosed by karyotyping (Tables 1 and 2), comprising 52 live births and 23 fetuses. Some of them underwent further testing (e.g., NGS, MLPA, or STR). We then identified the molecular components of 27 sSMC cases. They were compared with the information in http://cs-tl.de/DB/CA/sSMC/0-Start.html. These retrospective studies were approved by the

Chromosome karyotyping
Patients' peripheral blood and amniotic fluid samples were cultured, harvested, and stained with Giemsa (G-banded, at a resolution of approximately 300-400 bands) following standard protocols. They were then scanned with Leica CytoVision (Germany) and analyzed according to ISCN 2013.

MLPA
MLPA was performed in FGS (3730 DNA Analyzer, Singapore) according to the protocol of the "SALSA® MLPA® P245 Microdeletion Syndromes-1" kit (MRC Holland, Amsterdam, the Netherlands). DNA samples were prepared in the same way as for STR analysis. This MLPA kit can indicate 23 kinds of deletion or duplication syndrome. The sequencing primers are described in the protocol and include one for Xp21.1, three for Xq28, three for 15q11.2 (one UBE3A probe and two SNRPN probes), and one (Y-fragment S0135-