Copy number variation and regions of homozygosity analysis in patients with Müllerian aplasia

Background: Little is known about the genetic contribution to Müllerian aplasia, better known to patients as Mayer-Rokitansky-Küster-Hauser (MRKH) syndrome. Mutations in two genes (WNT4 and HNF1B) account for a small number of patients, and heterozygous copy number variants (CNVs) have also been described. The significance of these CNVs in the pathogenesis of MRKH is unknown, although their heterozygosity suggests possible autosomal dominant inheritance. We are not aware of CNV studies in consanguineous patients, which could pinpoint genes important in autosomal recessive MRKH. We therefore utilized SNP/CGH microarrays to identify CNVs and define regions of homozygosity (ROH) in Anatolian Turkish MRKH patients.
Result(s): Five different CNVs were detected in 4/19 patients (21%), one of which is a previously reported 16p11.2 deletion containing 32 genes, while four involved smaller regions each containing only one gene. Fourteen of 19 patients (74%) had parents that were third degree relatives or closer. There were 42 regions of homozygosity shared by at least two MRKH patients, which were spread throughout most chromosomes. Of interest, eight candidate genes suggested by human or animal studies (RBM8A, CMTM7, CCR4, TRIM71, CNOT10, TP63, EMX2, and CFTR) reside within these ROH.
Conclusion(s): CNVs were found in about 20% of Turkish MRKH patients, and as in other studies, proof of causation is lacking. The 16p11.2 deletion seen in mixed populations is also identified in Turkish MRKH patients. Turkish MRKH patients have a higher likelihood of being consanguineous than the general Anatolian Turkish population. Although identified single gene mutations and heterozygous CNVs suggest autosomal dominant inheritance for MRKH in much of the western world, regions of homozygosity, which could contain shared mutant alleles, make it more likely that autosomal recessively inherited causes will be manifested in Turkish women with MRKH.


Introduction
Approximately 7-10% of women have uterovaginal anomalies [1], but perhaps the most severe is Müllerian aplasia, which is also known as Mayer-Rokitansky-Küster-Hauser (MRKH) syndrome, the name patients prefer [2]. These patients have congenital absence of the uterus and vagina (type I; MIM# 277000), or they may also have associated anomalies such as renal agenesis, skeletal abnormalities, cardiac anomalies, or deafness (type II; MIM# 601076) [3]. Additionally, emotional issues as well as concerns regarding family planning are prevalent for these patients [4]. Although MRKH affects only 1/4500-1/5000 females, it accounts for about 10% of the causes of primary amenorrhea in females [5].
There is evidence for genetic transmission, as there are some families with more than one affected MRKH individual [6,7]. In our recent characterization of both North American and Turkish families (n = 147 probands), no family had more than one affected individual, but some had another person with one or more of the associated anomalies [2]. Vertical transmission is challenging to confirm unless a woman with MRKH conceives with IVF and uses a gestational carrier. Consequently, the genetic etiology of MRKH is largely unknown. To date, only two genes, WNT4 [8][9][10][11] and HNF1B [12], have confirmed, causative mutations in a handful of MRKH patients. A total of four translocations have been identified in MRKH [13][14][15], but in only one were the breakpoints mapped [15]. Although no gene was directly disrupted, this valuable patient with a translocation involving chromosomes 3p22.3 and 16p13.3 can help pinpoint potential candidate genes that could be affected by a position effect [15].
A number of investigators have utilized chromosomal microarrays (CMAs) in MRKH, using comparative genomic hybridization (CGH) and/or single nucleotide polymorphism (SNP) techniques [16][17][18][19][20][21]. Many copy number variants (CNVs) have been reported, and several have been found repeatedly, including deletions of 17q12, 16p11, and 22q11 [19]. Deletions and duplications of 1q21.1 have also been described by multiple investigators [16,20,22,23]. These chromosomal regions contain numerous genes, and although they contain promising candidate genes, their role in causation is currently unknown. To date, all of the CNV studies in MRKH have been in mixed, nonconsanguineous, non-autosomal recessive populations. In the present study, we sought to use CMAs to identify CNVs and regions of homozygosity (ROH) in a suspected consanguineous Turkish population to provide additional clues to important candidate genes which might cause autosomal recessive MRKH.

Patients
Nineteen Anatolian Turkish patients with a normal 46,XX karyotype were diagnosed with MRKH in the Department of Obstetrics and Gynecology at Akdeniz University Hospital, Turkey. The study took place there and at the Medical College of Georgia at Augusta University, USA. The study was approved by the Institutional Review Boards at both locations, and each person signed a consent form. All patients had normal breast development and an absent vagina by exam supported by imaging studies. Of these 19, three had renal agenesis and two had hypoplastic ovaries (Table 1). Consanguinity was ascertained by family history when the patient was enrolled in the study. Genomic DNA was extracted from peripheral blood samples of patients and available family members by a non-enzymatic salt-precipitation method as described previously [24].

Copy number variation (CNV) analysis
Copy number variant analysis was performed on all 19 patients and available family members (if a CNV was identified) with the use of an Affymetrix Cytoscan HD array (Affymetrix, Inc., Santa Clara, CA), which contains 750,000 single-nucleotide polymorphism probes and 1.9 million oligonucleotide probes. The lower limit of detection for CNVs was 50 kilobases (kb). One hundred nanograms of genomic DNA was labeled and used along with the Cytoscan reagent kit according to the manufacturer's instructions. The array data were analyzed with Chromosome Analysis Suite software as described previously [25]. Human genome hg19 assembly was used to map genomic coordinates. The identified CNVs were compared with Database of Genomic Variants (DGV, http://projects.tcag.ca/cgi-bin/variation/gbrowse/hg19/) to determine if they were unique or previously identified. The CNVs were also investigated for potential pathogenicity using Decipher (https://decipher.sanger.ac.uk/).

Analysis of parental consanguinity and regions of homozygosity
Patient history was used to ascertain the degree of consanguinity in the parents of the MRKH subject. Regions of homozygosity (ROH) analysis was performed on all 19 Turkish patients tested using the Affymetrix Cytoscan HD array to estimate F ROH, which is also known as a coefficient of consanguinity. F ROH was calculated by summing autosomal homozygous DNA base pairs (segments > 5 Mb that include at least 100 consecutive probes) and dividing by the total base pairs of autosomal genomic DNA [25]. The percentage of autosome/genome homozygosity (CHP Summary) determined by F ROH was analyzed using Chromosome Analysis Suite (ChAS) 1.2 software (Affymetrix Data Analysis Software). The thresholds of the percentage of ROH to predict the degree of consanguinity were taken from Sund et al. [25]. Overlapping homozygous genomic regions in at least two patients were determined by comparing the length of shared sequence.
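The F ROH calculation described above can be illustrated with a minimal sketch. This is not the ChAS implementation: the `RohSegment` type, the `f_roh` function, and the autosomal genome length of roughly 2.88 Gb for hg19 are assumptions introduced here for illustration only. The 5 Mb and 100-probe filters follow the description in the text.

```python
from dataclasses import dataclass

# Assumption (not from the paper): approximate hg19 autosomal length in base pairs.
AUTOSOMAL_GENOME_BP = 2_881_000_000
MIN_ROH_BP = 5_000_000   # segments must be longer than 5 Mb (from the text)
MIN_PROBES = 100         # and span at least 100 consecutive probes (from the text)
AUTOSOMES = {str(i) for i in range(1, 23)}


@dataclass(frozen=True)
class RohSegment:
    """Hypothetical container for one homozygous segment call."""
    chrom: str
    start: int
    end: int
    n_probes: int

    @property
    def length(self) -> int:
        return self.end - self.start


def f_roh(segments, genome_bp=AUTOSOMAL_GENOME_BP):
    """Sum qualifying autosomal ROH base pairs and divide by autosomal genome size."""
    total = sum(
        s.length
        for s in segments
        if s.chrom in AUTOSOMES
        and s.length > MIN_ROH_BP
        and s.n_probes >= MIN_PROBES
    )
    return total / genome_bp
```

Segments on the sex chromosomes, segments of 5 Mb or shorter, and segments supported by fewer than 100 probes are ignored, so only long autosomal runs contribute to the fraction.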

Results
Five different likely pathogenic CNVs were identified in four of 19 (21%) Turkish patients by CMA (Table 2), all of whom had isolated (type I) MRKH. One was the 16p11.2 deletion previously described in MRKH, here 746 kb in size; a similar sized CNV was seen in DGV six times, but not in Decipher. When a CNV of any size that overlaps the 16p11.2 region is considered, it was seen 125 times in DGV and 10 times in Decipher. This patient also had an Xq25 deletion of 768 kb, present once in DGV but not in Decipher (any sized CNV 17 times in DGV; none in Decipher). Within the Xq25 deletion, there was only one gene. One patient had a 16p13.3 deletion, which was present multiple times in both DGV and Decipher. The other two MRKH patients had duplications of 13q14.11 (once in DGV; not in Decipher) and 1p31.1 (not in DGV or Decipher) (Table 2). Except for the 16p11.2 deletion, which contained 39 genes, the other CNVs each had only 1-3 genes (Table 2). Family members of these four MRKH patients could not be studied, so it is not known whether the CNVs arose de novo.
By history, 11 of the 19 Turkish patients did not know if consanguinity was present, while eight stated that their parents were first cousins. First cousins should share 1/16 (6.25%) of sequence. When ROH were analyzed, the degree of consanguinity was greater than the patients had reported (Table 3). Instead of parents being third degree relatives, six were found to be second degree relatives with sharing of 8.8-18.3% of loci, one was first or second degree (20% shared loci), and one was first degree (23.5% shared loci). For the 11 for whom no history was known, parents were second degree in one and third degree in three, while the others were third or fourth degree relatives. In total, 14 of 19 (~74%) MRKH patients had parents that were third degree relatives or closer.
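For orientation, the 1/16 figure quoted for first cousins follows from the standard expectation that each additional degree of parental relationship halves the expected homozygous fraction. The sketch below computes these theoretical values only. The function name `expected_homozygosity` is an assumption made here, and the empirical thresholds actually used to assign degrees (taken from Sund et al. [25]) are not reproduced.

```python
def expected_homozygosity(degree: int) -> float:
    """Theoretical expected autozygous fraction in the offspring of parents
    related at the given degree (1 = first degree, 3 = first cousins)."""
    if degree < 1:
        raise ValueError("degree must be 1 or greater")
    return 0.5 ** (degree + 1)


# Theoretical values: 1st degree 25%, 2nd 12.5%, 3rd 6.25%, 4th 3.125%.
for d in range(1, 5):
    print(f"degree {d}: {expected_homozygosity(d):.4%}")
```

Observed values vary around these expectations, which is why empirical ranges rather than single cut-offs are needed to assign a degree of relationship.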
In addition, there were 42 regions across the genome in which at least two MRKH patients had overlapping homozygous genomic regions (Table 4 and Fig. 1). The most frequently shared chromosomes were chromosomes 2, 3, and 4. All chromosomes were represented except 11, 16, 19, and 21. The shared regions contained as few as 10 genes or as many as 354 genes. None of the shared regions included the more common 17q12 or 16p11.2 CNVs, but two shared the 22q11.21 CNV region (Table 4).
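One way to find regions that are homozygous in at least two patients is a sweep over segment boundaries on each chromosome. The sketch below is an illustrative assumption and not the procedure used in this study, which compared the length of shared sequence. The function name `shared_roh` and the input layout are hypothetical, and each patient's own segments are assumed not to overlap one another.

```python
from collections import defaultdict


def shared_roh(roh_by_patient, min_patients=2):
    """Return (chrom, start, end) intervals covered by ROH in >= min_patients.

    roh_by_patient maps a patient ID to a list of (chrom, start, end) tuples.
    Assumes segments belonging to one patient do not overlap each other.
    """
    events = defaultdict(list)
    for segments in roh_by_patient.values():
        for chrom, start, end in segments:
            events[chrom].append((start, +1))
            events[chrom].append((end, -1))

    shared = []
    for chrom in sorted(events):
        depth = 0
        region_start = None
        # Tuples sort by position, then by delta, so at equal positions an
        # end (-1) is processed before a start (+1): touching segments are
        # not treated as overlapping.
        for pos, delta in sorted(events[chrom]):
            depth += delta
            if depth >= min_patients and region_start is None:
                region_start = pos
            elif depth < min_patients and region_start is not None:
                shared.append((chrom, region_start, pos))
                region_start = None
    return shared
```

Each returned interval could then be intersected with a gene annotation to count the genes it contains.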

Discussion
The pathogenesis of MRKH in humans is largely unknown, but could include genetic (germline or somatic cell mutations), epigenetic, and/or environmental etiologies. There is evidence supporting a genetic etiology, as demonstrated by families with more than one affected proband [7]. Although twin studies in which monozygotic twins show greater concordance than dizygotic twins support a genetic component [26], there have been few such studies in MRKH. The small number of monozygotic twins reported have been discordant for MRKH [27][28][29]. The genetic basis of MRKH is largely unknown except for occasional heterozygous WNT4 or HNF1B mutations [8,12]. Many investigators have performed CMA on MRKH patients and have suggested possible pathogenic CNVs [19,30]. It is interesting to note that these CNVs may be found in isolated MRKH (type I) or those with associated anomalies (type II) [19,30]. In the present study, we found five CNVs in four patients with type I MRKH, three of whom were products of consanguineous parents. This is consistent with the overall rate of consanguinity of approximately 74% in our study. The 21% prevalence of CNVs in our largely consanguineous Turkish population does not seem to differ from the prevalence in studies from Europe and North America, which ranges from 16 to 46% (26% overall in four studies) [17,19,20,21]. The previously reported 16p11.2 deletion was observed in one patient.
Table notes: DGV, Database of Genomic Variants; Del, deletion; Dup, duplication. The number of times a very similar sized CNV is listed is given for both DGV and Decipher; the number in parentheses is the number of times a CNV of any size overlapped any portion of our CNV region. *RBFOX1 is a gene known in relation to autism. Only patient number 6 had parents who were not consanguineous (4th degree relatives). Patient numbers 7 and 8 had parents that were 3rd degree relatives, while patient 9 had parents that were 2nd degree relatives.
Patients with microdeletions at 16p11.2 may show variable clinical features including autism [31], epilepsy, global developmental delay, dysmorphism, behavioral problems, abnormal head size [32], and obesity [32]. Microdeletions at 16p11.2 are also common in patients with type I and type II MRKH [19,21]. This region contains more than 30 genes. The T Box 6 (TBX6) gene located in this region represents an attractive candidate gene, but to date, no causative mutations have been confirmed. This same patient had an Xq25 deletion, which contains one gene, ACTRT1 (actin-related protein T1), which has no proven relation to MRKH at this time. Two other type I patients had CNVs containing only one gene: a 16p13.3 deletion (RBFOX1) and a 13q14.11 duplication (FOXO1). The remaining type I patient had a 1p31.1 duplication containing three genes (ST6GALNAC3, MSH4, and ASB17). The 16p13.3 region and the RBFOX1 gene have been implicated in autism; FOXO1 is a transcription factor; and ST6GALNAC3 is expressed in the reproductive tract. MSH4 is a member of the DNA mismatch repair mutS family necessary for reciprocal recombination and proper segregation of homologous chromosomes at meiosis I. ASB17, which is highly expressed in the testis, is a component of an E3 ubiquitin-protein ligase complex that mediates the ubiquitination and subsequent proteasomal degradation of target proteins.
The significance of these CNVs is uncertain at this time, but it is unlikely that the 16p13.3 deletion is involved in the pathogenesis of MRKH because it occurs frequently in both the DGV and Decipher databases. In contrast, the 16p11.2 CNV has been previously reported in MRKH, and large CNVs similar in size are infrequent in these two databases. The other three CNVs (Xq25, 13q14.11, and 1p31.1) are potentially pathogenic.
When the literature is examined, chromosomal regions 17q12, 16p11, 22q11, and 1q21.1 harbor some of the more common CNVs in MRKH [16][17][18][19][20][21]. Deletions of 17q12 generally range from 1.2-1.8 Mb in size and contain 17-20 genes. HNF1B, a known causative gene encoding a transcription factor, resides within this region, and heterozygous mutations result in maturity onset diabetes of the young type 5 (MODY5). Associated findings with this phenotype may include renal cysts and Müllerian aplasia [12]. LHX1 is another potential causative gene within this region, as the knockout mouse has a phenotype consistent with MRKH. However, there are currently no clearly causative human LHX1 mutations confirmed by in vitro analyses and supported by family studies [2,33]. We have recently performed Sanger DNA sequencing on 100 North American and Turkish MRKH women, and none had small insertion/deletions or point mutations in WNT4, LHX1, or HNF1B, suggesting variants are rare in these genes [2]. The 22q11 region is involved in the DiGeorge phenotype and other associated disorders, while deletions or duplications of 1q21.1 have been identified in type I MRKH. However, their significance to the pathophysiology of MRKH is unknown at this time [30]. Copy number variants are typically heterozygous [2], but since consanguineous marriages are common in Turkey, we sought to determine if MRKH patients had large regions of homozygosity (ROH). Turkish patients in the current study consisted of Anatolian-origin Caucasians, who are predominantly from Antalya, Turkey. As reported by Alper et al. in 2004, the rate of consanguineous marriages in the province of Antalya was found to be 33.9% [34]. People in this region have a greater risk of autosomal recessively inherited genetic diseases. Analysis of ROH may provide a good starting point to determine the genetic basis of disease in the offspring of such consanguineous families.
Ours is the first study, to our knowledge, to perform ROH analysis in consanguineous MRKH families by CMA.
It is interesting that nearly three quarters of our Turkish MRKH patients demonstrated consanguinity, as defined by having parents that were third degree relatives or closer. For all eight of our patients who stated their parents were first cousins, ROH analysis indicated that the parents were first or second degree relatives. For the remaining 11 MRKH patients who did not know whether consanguinity was present, 7/11 had parents that were third or second degree relatives. Therefore, the chance of consanguinity was greater in MRKH patients than reported for Anatolian people in general, which suggests that autosomal recessive loci could be responsible for some cases of MRKH.
Further supporting consanguinity, there were 42 regions across the genome in which at least two MRKH patients had overlapping homozygous genomic regions, most frequently on chromosomes 2, 3, and 4. None of the shared regions included the 17q12 or 16p11.2 CNVs, but they did include 22q11.21. When putative candidate genes from the literature are surveyed, either based upon probable function and/or animal models, eight genes (RBM8A, CMTM7, CCR4, TRIM71, CNOT10, TP63, EMX2, and CFTR) reside within these shared regions, which could suggest a role in MRKH and a possible founder effect if mutations are discovered (Table 5).
The inheritance of MRKH is most likely to be autosomal dominant for most of the world based upon heterozygous single gene mutations and heterozygous CNVs. However, the large percentage of consanguinity and shared regions of homozygosity in Turkish MRKH patients suggest the existence of an autosomal recessive form. Ideally, homozygosity mapping followed by whole exome sequencing to pinpoint the causative genes should be done in more patients and their family members to narrow down candidate genomic regions for MRKH. However, our results provide additional candidate genes to study, and we suggest that there may be autosomal recessive causes of MRKH that could be identified in consanguineous Turkish families.

Conclusion
CNVs were identified in approximately 20% of Turkish MRKH patients, but it is unknown if they are causative. It is interesting that the 16p11.2 deletion CNV seen in other populations was also found in a Turkish MRKH patient. Our findings suggest that Turkish MRKH patients have a greater chance of consanguinity than the general Anatolian Turkish population. In contrast to other reports suggesting autosomal dominant inheritance of MRKH, the extremely high rate of shared regions of homozygosity suggests that inheritance of some cases of MRKH in Turkey could be autosomal recessive.