Validation and implementation of array comparative genomic hybridisation as a first line test in place of postnatal karyotyping for genome imbalance

Background Several studies have demonstrated that array comparative genomic hybridisation (CGH) for genome-wide imbalance provides a substantial increase in diagnostic yield for patients traditionally referred for karyotyping by G-banded chromosome analysis. The purpose of this study was to demonstrate the feasibility of and strategies for, the use of array CGH in place of karyotyping for genome imbalance, and to report on the results of the implementation of this approach. Results Following a validation period, an oligoarray platform was chosen. In order to minimise costs and increase efficiency, a patient/patient hybridisation strategy was used, and analysis criteria were set to optimise detection of pathogenic imbalance. A customised database application with direct links to a number of online resources was developed to allow efficient management and tracking of patient samples and facilitate interpretation of results. Following introduction into our routine diagnostic service for patients with suspected genome imbalance, array CGH as a follow-on test for patients with normal karyotypes (n = 1245) and as a first-line test (n = 1169) gave imbalance detection rates of 26% and 22% respectively (excluding common, benign variants). At least 89% of the abnormalities detected by first line testing would not have been detected by standard karyotype analysis. The average reporting time for first-line tests was 25 days from receipt of sample. Conclusions Array CGH can be used in a diagnostic service setting in place of G-banded chromosome analysis, providing a more comprehensive and objective test for patients with suspected genome imbalance. The increase in consumable costs can be minimised by employing appropriate hybridisation strategies; the use of robotics and a customised database application to process multiple samples reduces staffing costs and streamlines analysis, interpretation and reporting of results. Array CGH provides a substantially higher diagnostic yield than G-banded chromosome analysis, thereby alleviating the burden of further clinical investigations.


Background
Karyotype analysis of G-banded chromosomes is the cytogenetic standard for the detection of copy number imbalance across the genome, or balanced chromosome rearrangements, in children with such features as idiopathic developmental delay, learning difficulties, congenital abnormalities or autism. However, this technique has a resolution of only 3-5 Mb, and interpretation is operator-dependent, requiring highly trained and specialised staff to carry out the analysis. Higher resolution targeted tests such as fluorescence in situ hybridisation (FISH) [1] and multiplex ligation-dependent probe amplification (MLPA) [2,3] have, over the years, been added to the cytogenetic repertoire in order to increase the diagnostic yield in this group of patients.
Comparative Genomic Hybridisation (CGH) was first introduced in the early 1990s for the detection of DNA amplification in tumours [4], and used metaphase spreads as hybridisation targets. The development of array CGH, where short stretches of DNA on glass slides are used as targets instead of metaphase spreads, increased the resolution of the CGH approach and allowed the implementation of this test for more widespread use, particularly in tumour cytogenetics [5,6].
Array CGH has the potential to deliver a higher resolution test compared with the 3-5 Mb limit of G-banded chromosome analysis, and has added advantages of objectivity and high throughput. However, the introduction of this test into constitutional diagnostic cytogenetic services has been slow, mainly due to the expense of the consumables and to the wide-spread and established acceptance of karyotype analysis as the firstline test for genome-wide copy-number imbalance.
We have validated array CGH by testing and comparison of different platforms and hybridisation and analysis strategies, and introduced the validated protocol as a first-line test for patients with suspected copy number imbalance. Here we describe our validation studies, and the results of our diagnostic service testing from 04/2008 to 12/2009.

Patients
In the initial, validation, phase of this study (06/2006 to 03/2008), DNA from patients with known abnormalities, detected by G-banded chromosome analysis, was tested to confirm and validate the array procedure. In addition, patients considered likely to have submicroscopic chromosome imbalance, but with normal karyotypes, were selected and referred by our Clinical Genetics team. Results were reported as research findings only. When array CGH was introduced as a validated procedure (04/ 2008), testing was still restricted to those referred through our clinic following a stringent gate keeping of any requests. For these patients, formal diagnostic reports were issued. Genetic testing in the UK is funded by the National Health Service (NHS), with testing approved and commissioned by local NHS genetics commissioning bodies, based on evidence of test cost-effectiveness; following costings approval by our NHS genetics Commissioners, array CGH was introduced as a first line test in place of karyotype analysis on 05/2009, and was then applied to all samples from our region with referral indications suggesting copy number imbalance.

DNA extraction and QC
DNA was extracted from blood samples in EDTA using the Chemagic Automated DNA Separation System (Chemagen, Germany) following the manufacturer's instructions. All DNA was quantified by spectrophotometry (Nanodrop, USA) and checked for degradation on an agarose gel. Degraded DNA was not used; new samples were requested for these cases.
Labelling DNA for array CGH DNA (1 μg) was labelled using CGH Labelling Kit for BAC Arrays (Enzo Life Sciences, USA) or CGH Labelling Kit for Oligo Arrays (Enzo Life Sciences, USA) as appropriate, following the manufacturer's instructions.

Purification and QC of labelled DNA
Labelled DNA was purified post-labelling using QIAquick PCR Purification Kit (Qiagen, USA) following the manufacturer's instructions. Labelling efficiency and yield was assessed by spectrophotometry (Nanodrop, USA).

Processing arrays
Hybridisation, washing and scanning of arrays was performed following the manufacturers' protocols.

Hybridisation strategies
Three different strategies were considered: i) "dye swap" -hybridisation of patient DNA against control DNA, with a repeat assay but with labelling in opposite colours. Two arrays were required for each patient sample, and this strategy gave confidence that the array findings were not artefactual, and controlled for bias in cyanine (Cy) dye labelling reaction efficiencies. ii) "loop" [7]-three patient samples labelled in each of two colours, then each sample hybridised against the other two patient samples. Only one array was required for each patient sample, and if the two labelling runs were carried out on separatelyaccessed DNA aliquots, then no confirmation of sample identity was required. This strategy also gave confidence that the array findings were not artefactual, and controlled for bias in Cy dye labelling reaction efficiencies. iii) "patient/patient" -differentially-labelled patients hybridised against each other. Patients were phenotype mismatched as any shared imbalance would not be detected. Cy dye ratios informed "ownership" of any detected imbalance (log2 ratios of -1 and 0.6 indicate deletion and duplication respectively), which was confirmed by a second array or by MLPA for any syndromic imbalance covered by MLPA kits P064 and P245 (~6% of cases; MRC-Holland, Netherlands). This second test also confirmed sample identity. Therefore the first round of array testing required half an array; for each detected imbalance (in approximately 25% of patients), a further half array is required. This therefore resulted in an overall expenditure of 5/8 of an array per patient, providing substantial cost savings over other approaches.

Array data analysis
All arrays were scanned at 5 μm resolution (Agilent, USA).

BAC arrays
Image quantification, array quality control and aberration detection were performed using BlueFuse software (BlueGnome, UK). 95% of array data was required to pass QC. Samples failing this QC were repeated. A normal range for log2 fluorescence ratios was set to 0 +/-4 SD.

Oligo arrays
Image quantification, array quality control and aberration detection were performed using Feature Extraction and DNA Analytics software packages (Agilent, USA) for oligo arrays. Manufacturer's recommendations were followed. 95% of array data was required to pass QC. Samples failing this QC were repeated. Two rounds of aberration calling were employed, ADM-2 algorithm at threshold 6 (with a 3 probe sliding window providing a mean detection interval of 200 kb) for detection of copy number imbalance, and ADM-1 algorithm at threshold 6 for detection of low-level mosaicism. Each called aberration was assessed by an analyst to provide a further measure of quality, to examine the genomic context of the aberration and to identify any specific genes/regions where further investigations might be required.

Fluorescence in situ hybridisation
BAC array findings were confirmed by fluorescence in situ hybridisation (FISH) using the same clones (Bluegnome, UK and BACPAC, Sanger Institute, UK) as on the array [8].

Multiplex ligation-dependent probe amplification
MLPA was used to confirm imbalance of syndromic regions represented in MLPA kits P064 and P245 (MRC-Holland, Netherlands). Furthermore, inheritance studies were performed by incorporating customdesigned MLPA probes, specific for loci within the region of imbalance, into MLPA kit P200 (MRC-Holland, Netherlands). Custom MLPA probes were designed using the H-MAPD tool [9]. All MLPA reactions were performed following the manufacturer's instructions.

QF-PCR
Quantitative Fluorescence PCR was carried out as previously described [10] Reporting Imbalances of regions represented in the Database of Genomic Variants (DGV) [11] as being present in the normal population (in at least three non-BAC based studies) were recorded but not reported. All other imbalances outside areas of known syndromic imbalance were reported as preliminary findings, with a request for parental blood samples to establish inheritance. Inherited imbalances were reported as probably benign, although "unusual features shared by the proband and carrier parent may be associated with the imbalance". Genes in regions of de novo imbalance were described if considered relevant to the referral indication. The average reporting time for first-line tests was 25 days from receipt of sample.

Final diagnostic service protocol
Patient samples were processed as detailed above, on the Agilent 44 K platform (Wessex NGRL design 017457). A patient/patient hybridisation strategy provided substantial consumable and labour savings while maintaining diagnostic yield. Figure 1 shows an example workflow for two patients, one of whom carries an imbalance, that are hybridised against each other. For findings not present in DGV, Cy3/Cy5 signal ratios were used to determine which hyb partner carried the imbalance. A second follow-up array was used to confirm imbalance in this patient, before reporting both hyb partners. Once parental samples were received, custom MLPA or array testing were used as appropriate for inheritance studies.

Karyotype audit
Array CGH testing does not detect balanced rearrangements; we therefore audited a 10-year period of karyotype results from our laboratory, which showed that of a total of 36,663 postnatal blood samples received, 4 (0.0001%) were found to have de novo reciprocal translocations.

Array validation
Both BAC array and oligoarray validation included "selfself" hybridisations and DNA from cases of known imbalance including trisomies for chromosomes 13 and 18, and deletions and duplications ranging from 0.5-7.1 Mb. Both array platforms gave the expected results from these validation experiments. Further to these known abnormalities, a number of patients with normal karyotypes but considered likely to carry imbalance were also tested. Where imbalance was detected by BAC array, FISH with the abnormal clones was carried out where possible. For 3/16 (18%) single clone results, the FISH result was not concordant with the BAC array findings. This was considered to be due to the differing sensitivities of the techniques. All larger imbalances detected by BAC arrays were confirmed by FISH. Abnormalities detected on oligoarrays were followed up by a variety of techniques (MLPA, QF-PCR, retrospective G-banded chromosome analysis) in 68 cases; no discrepancies were found. 42 patients (with normal karyotypes) were tested on both BAC and oligo array platforms. 30 of these gave normal results on both platforms while 6 gave abnormal results on both platforms. In these concordant cases (Table 1), the resolution of the breakpoints given by the oligo arrays was higher than that by the BAC arrays, due to higher probe density across the genome and the smaller size of the oligo probes. The remaining 6 discordant results (Table 2) all demonstrate imbalances that were not detected by the BAC platform. Most notable is a patient with imbalance of the 16p11.2 autism susceptibility locus [12].

Diagnostic service
A total of 2,414 patients were tested as part of the diagnostic service. Overall, 24% of patients tested had copy number variants (CNVs) not represented in the DGV, and therefore considered to be potentially pathogenic Figure 1 Array CGH diagnostic service workflow. This example illustrates the diagnostic workflow for two patients, one of whom has a previously undetected imbalance of potential clinical significance. Yellow shading signifies automated procedures (we plan to also automate labelling, hybridisation and washing). All patients with imbalance not represented in the DGV were followed up by other techniques or repeat arrays prior to reporting. Parental samples were requested in all cases, but only received for 40%. These inheritance studies showed that de novo imbalances accounted for 35% of findings for patients tested after a normal karyotype, and 13% of findings from first line tests (Figure 2). At least 89% of the abnormalities detected by first line testing would not have been detected by karyotype analysis (assuming a 3 Mb detection threshold for imbalance detection by G-banded chromosome analysis). Overall, 14% of imbalances detected fell within known sites of recurrent microdeletion/duplication.

Discussion
Array CGH testing does not detect balanced rearrangements, and therefore may miss a clinically significant karyotype where a de novo balanced rearrangement may be disrupting gene function. We therefore carried out an audit of postnatal karyotype results at our Centre which showed that over this period 0.0001% of samples had a de novo reciprocal translocation. Inherited translocations would be considered incidental findings, which, although potentially of value for reproductive counselling for the parents, would not be relevant to the proband's referral. The increase in detection rate of chromosome imbalance using our strategy is therefore a major benefit, whereas the number of clinically significant chromosome rearrangements that remain undetected is extremely low.
The validation study indicated that the oligonucleotide approach was robust and cost-effective, as well as having a higher resolution than BAC arrays (see Tables 1 and 2), with no false positive findings when followed up with    different methodologies, or false negative findings when samples with abnormal karyotypes were tested. The choice of the 4 × 44 K array platform was pragmatic, with the aim of maximising detection of clinically significant imbalances whilst minimising costs. This approach was therefore introduced as a diagnostic test in May 2008, in tandem with a comprehensive education initiative for local users of the service. Since then, 2,414 patients have been tested, with an overall imbalance detection rate of 24% (excluding previously published population polymorphisms (see above)). Previously published reports of detection rates in cohorts of patients with congenital anomalies, mental disability or dysmorphism, are heterogeneous in design and application; those reporting cohorts of ≥ 100 patients include those using tiling-path BAC arrays [13], 1 Mb BAC arrays [14][15][16], targeted BAC arrays [17][18][19], and oligonucleotide arrays [20][21][22]. The majority of patients in these cohorts had normal karyotypes, and some had also had subtelomeric imbalance excluded using either FISH or MLPA. The overall imbalance detection rates ranged from~8% (using targeted BAC arrays) [17] to 20% (using 1 Mb BAC arrays [14] or oligonucleotide arrays [22]). These previous studies therefore indicate that the protocols described here, including the patient/patient hybridisation strategy, did not compromise the diagnostic yield in this cohort. Figure 2 shows that de novo abnormalities were less prevalent when testing was carried out as a first line test. This is likely to be due to stringent selection for post-karyotype array CGH of patients likely to carry submicroscopic imbalance, carried out by expert dysmorphologists and clinical geneticists. The cohort of patients tested by array CGH as a first line test included those with referrals more likely to be associated with polygenic and multifactorial conditions such as autism and Attention Deficit Hyperactivity disorder, where genome imbalance is less likely to be found, but needs to be excluded. De novo abnormalities are generally considered likely to be pathogenic and causative of the patient's phenotype, whilst it has been suggested that inherited imbalances should be classified as "false positive" findings [23]. However, in some families, the carrier parent may show mild manifestation of the clinical phenotype affecting the child, and expert phenotypic assessment of the carrier parent is necessary to ascertain this; indeed, in our patient group we have found families where the imbalance was present in more than one other family member, all of whom had phenotypes that were very much milder than that of the proband. The possibility that a deletion unmasks a recessive allele in the proband is another reason for careful consideration of the gene content of inherited as well as de novo imbalances. As more information on the function and copy-number sensitivity of genes is accumulated, the interpretation of such findings is likely to become more straightforward.

Rationale for service strategy
The protocols described here were developed with the aim of providing an array CGH service as a first line test with little or no additional cost over and above that of our current karyotype analysis. This cost includes confirmatory testing of detected imbalance and subsequent inheritance studies.
The following factors contributed to reducing costs and increasing efficiency of array CGH testing: i) Patient/patient hybridisation: this strategy halved consumable costs, and increased the efficiency of analysis (for normal arrays, only one analysis was required for two patients). The disadvantage of this approach is that, were patients with the same imbalance to be hybridised against each other, the abnormality would not be detected (two patients with deletion and duplication of the same region respectively would be detected). Our protocol therefore included a careful consideration of patient referral information when constructing run sheets for hybridisation (phenotype mismatching), with the aim of reducing the incidence of missed abnormalities. However, with the sparse clinical information provided on many referral forms, some imbalances may have been missed. Data from our cohort suggests a less than 1 in 25,000 chance of Figure 2 Comparison of de novo imbalance detection rates. Proportion of de novo findings in first line array CGH and postnormal karyotype array CGH cohorts. Higher rates of de novo findings in the post-normal karyotype cohort reflect the stringent clinical selection of these patients; these patients would have been diagnosed at an earlier stage had array CGH been carried out as the first line test.
failing to detect 16p11.2 deletion syndrome (our most common finding, present in 15 (~0.62% of patients) even without phenotype mismatching. Furthermore, the abnormality pick-up rate for array CGH testing is considerably higher than for karyotype analysis, and any abnormalities missed as a consequence of the hybridisation strategy would likely be beyond the resolution of G-banded chromosome analysis. ii) Batch testing: processing of multiple samples for labelling, clean-up and hybridisation considerably reduced the technical time required for testing, allowing a throughput of 96 samples/week. iii) Robotics: automating DNA purification reduced technical input time, and increased consistency of processing. We aim also to automate DNA labelling in the near future to reduce costs and reporting times further, to allow increased throughput, and to improve quality and consistency. iv) Dedicated IT resources: a database application was designed and built during the validation phase of this study. This allowed rapid processing of information, audit, sample tracking and construction of run sheets, with click-through links to internet resources, in particular the DGV, to expedite diagnosis and interpretation of findings.
The combination of the above resources and strategies has enabled us to offer this test, including confirmatory and inheritance follow-up testing, at the same cost as karyotype analysis. We recognise that other centres will have different financial and staffing constraints and may not have access to some of the equipment and expertise required, but nevertheless we feel that aspects of this approach may be of interest to those wishing to introduce array CGH testing into their diagnostic service.
The clinical utility and cost-effectiveness of array CGH have been discussed in previous papers, which concluded that, even without the increases in efficiency described in this paper, array CGH results in a cost per diagnosis less than that for karyotype analysis [24], and predicts that array CGH is likely to become the genetic test of choice for patients with suspected copy number imbalance [25].

Conclusions
This study demonstrates the feasibility of using oligonucleotide array CGH as a first line diagnostic test in place of G-banded chromosome analysis for genome-wide constitutional imbalance. This approach can be implemented with minimal cost implications using the protocols described. Array CGH provides a substantially higher diagnostic yield than G-banded chromosome analysis, thereby alleviating the burden of further clinical investigations, and is providing valuable information on the link between gene copy number and expression of abnormal phenotype.