Differentially accessible, single copy sequences form contiguous domains along metaphase chromosomes that are conserved among multiple tissues

During mitosis, chromatin engages in a dynamic cycle of condensation and decondensation. Condensation into distinct units to ensure high fidelity segregation is followed by rapid and reproducible decondensation to produce functional daughter cells. Factors contributing to the reproducibility of chromatin structure between cell generations are not well understood. We investigated local metaphase chromosome condensation along mitotic chromosomes within genomic intervals showing differential accessibility (DA) between homologs. DA was originally identified using short sequence-defined single copy (sc) DNA probes of < 5 kb in length by fluorescence in situ hybridization (scFISH) in peripheral lymphocytes. These structural differences between metaphase homologs are non-random, stable, and heritable epigenetic marks which have led to the proposed function of DA as a marker of chromatin memory. Here, we characterize the organization of DA intervals into chromosomal domains by identifying multiple DA loci in close proximity to each other and examine the conservation of DA between tissues. We evaluated multiple adjacent scFISH probes at 6 different DA loci from chromosomal regions 2p23, 3p24, 12p12, 15q22, 15q24 and 20q13 within peripheral blood T-lymphocytes. DA was organized within domains that extend beyond the defined boundaries of individual scFISH probes. Based on hybridizations of 2 to 4 scFISH probes per domain, domains ranged in length from 16.0 kb to 129.6 kb. Transcriptionally inert chromosomal DA regions in T-lymphocytes also demonstrated conservation of DA in bone marrow and fibroblast cells. We identified novel chromosomal regions with allelic differences in metaphase chromosome accessibility and demonstrated that these accessibility differences appear to be aggregated into contiguous domains extending beyond individual scFISH probes. These domains are encompassed by previously established topologically associated domain (TAD) boundaries. 
DA appears to be a conserved feature of human metaphase chromosomes across different stages of lymphocyte differentiation and germ cell origin, consistent with its proposed role in maintenance of intergenerational cellular chromosome memory.

and functional organization of chromatin regulates differential gene expression programs essential for processes such as cell growth, division, differentiation and survival [1][2][3]. Alternating cycles of chromatin condensation and relaxation are interwoven amongst these programs producing the dynamic chromosome organization observed throughout the cell cycle. A high degree of condensation is necessary to ensure high fidelity segregation during cell division, however a more relaxed chromatin organization is needed for proper genome access by regulatory and transcriptional machinery to ensure normal cell function in interphase [3][4][5]. Despite constant changes in function and morphology within the cell cycle and during differentiation, new generations of cells are able to accurately re-establish cell (or functional) programming consistent with that of parent cells [6,7]. The understanding of this mechanism remains incomplete. Epigenetic memory has been suggested as one mechanism to regenerate the same genome and epigenome organization in cell progeny [1,8]. Identification and characterization of mechanisms of mitotic memory and bookmarking are ongoing with both tissue dependent and independent mechanisms proposed [9][10][11][12][13].
We have identified non-random, stable differences in condensation between homologous metaphase chromosome alleles (termed differential accessibility or DA) using fluorescence in situ hybridization with short single-copy (sc) sequence DNA probes (scFISH) [14][15][16][17]. DA is a manifestation of differences in chromatin supercoiling between metaphase homologs, which can be abrogated with an inhibitor of topoisomerase IIα [15]. DA has been observed in ~ 10% of the scFISH probes developed from single copy sequences within clinically relevant regions of the human genome [14][16][17][18]. DA targets can include genic regions (exons and introns) or intergenic sequences. Previous characterization of DA loci on human metaphase chromosomes has been performed with phytohemagglutinin (PHA)-stimulated lymphocytes [14,15]. The role DA plays in the global condensation of chromosomes, transgenerational mitotic memory, or other aspects of nuclear organization remains unknown.
The transgenerational dynamics of chromosome condensation and relaxation must be consistent regardless of cell origin or genomic sequence. As a first step towards addressing these constraints, we examine genomic distribution of DA using linked sets of scFISH probes to define lengths of contiguous DA intervals. We also assessed whether DA was present at the same chromosomal loci among tissues at distinct somatic developmental stages and embryological origins (lymphocyte, bone marrow, and fibroblast cells). Characterizing the domain organization of DA in the genome and investigating DA among different cell types in which DA is found should provide clues into the role, if any, of local sequence compaction during metaphase chromosome condensation.

Differential hybridization patterns for single copy (sc) probes confirmed on normal human metaphase chromosomes by scFISH
The genome distributions of DA intervals and their extent were addressed by FISH hybridization of multiple sc probes (1459-3553 bp) to 7 different chromosomal targets across 5 autosomes. Table 1 indicates scFISH probes used in this study to assess chromatin accessibility. It includes 19 probes, 18 scFISH probes with DA developed in this study and a previously developed 1p36 control probe with equivalent accessibility [EA] [14,17], their chromosomal locations, and genome coordinates.
Chromosomal targets for the 6 anchor probes (bolded sequences) were selected from early human gene mapping studies, predating the publication of the full human genome sequence, in which published FISH images of metaphase chromosomes exhibited differences in hybridization intensities between homologs. We hypothesized that these differences in hybridization intensities might be related to DA (see Methods). In this regard, the selection of loci in this study purposefully differs from probes developed for our prior scFISH studies [16][17][19][20][21], which enriched for diagnostic EA probes present in expressed genes and in clinically relevant chromosomal regions. The initial DA sc probe developed for the chromosomal region targeted in each gene mapping publication [22][23][24][25][26] is bolded in Table 1. Figure 1 shows examples of hybridized metaphase cells with differences in probe fluorescence intensity between homologs for 3 DA probes: SCAMP2_IVS7-IVS4 (15q24.1), ZNF385D_cen678130 (3p24.3), and FGF6_tel4492 (12p12.3) (first three panels). Below each metaphase cell panel, enlarged hybridized homolog pair images show DA. The homolog with the less intense probe hybridization signal reflects a chromosomal target that is more condensed and less accessible for hybridization. A metaphase cell hybridized with a sc probe exhibiting equivalent accessibility (i.e. EA) to both homologs, 3.3_1p36 (1p36.3), is also shown (4th panel). The 1p36 sc probe served as a control probe for EA and was described previously [14].
Cytogenetic samples, prepared from PHA-stimulated peripheral blood from 23 different individuals, were used in this analysis to confirm chromosome location and determine hybridization pattern (DA or EA). ScFISH probe hybridized metaphase cells were initially analyzed qualitatively. For DA probes, a significantly greater proportion of cells demonstrated different probe hybridization intensities between homologs (73-89% of cells), in contrast to the control EA probe 3.3_1p36, which showed a greater proportion of cells with similar probe hybridization intensities between homologs (76% of cells) (Fig. 2A; Additional file 1: Table S1). This is consistent with our previous findings [14,17]. Loci were determined to show differential accessibility using a two-tailed binomial test with normal approximation. The same test was used to determine equivalent accessibility of probe 3.3_1p36. A two proportion Z-test (α = 0.05) demonstrated no significant difference between the fraction of DA cells scored for the 23 different individual samples that established the accessibility pattern for 17 of the 18 DA probes. The SCAMP2_IVS7-IVS4 probe was an exception: although it showed DA in both samples (i.e. > 2/3 of cells, 75% [49/65 cells] vs 92% [44/48 cells]), the proportion of cells with DA differed between samples (p = 0.02). There was also no statistically significant difference in the fraction of cells with EA between individual samples hybridized with EA probe 3.3_1p36. These results indicate that chromosomal accessibility differences between homologs are stable between unrelated individuals.
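The between-sample comparison described above can be illustrated with a short sketch of a two proportion Z-test. This is a minimal example that assumes the pooled-variance form of the test and a two-tailed normal p-value; the text does not state which variant was used, so the function name `two_proportion_z_test` and its details are illustrative rather than a record of the actual analysis. The counts are the SCAMP2_IVS7-IVS4 values quoted above.

```python
import math


def two_proportion_z_test(x1, n1, x2, n2):
    """Two-tailed Z-test for equality of two proportions.

    Assumption: pooled-variance form; x = cells scored as DA, n = cells scored.
    """
    p1, p2 = x1 / n1, x2 / n2
    pooled = (x1 + x2) / (n1 + n2)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    z = (p1 - p2) / se
    # Two-tailed p-value from the standard normal: 2 * (1 - Phi(|z|))
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value


# SCAMP2_IVS7-IVS4: 49/65 DA cells in one sample vs 44/48 in the other
z, p = two_proportion_z_test(49, 65, 44, 48)
print(f"z = {z:.2f}, p = {p:.3f}")
```

With these counts the sketch yields a p-value below α = 0.05, in line with the reported difference between the two samples; the exact value depends on the variance convention chosen.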

Quantification of DA confirms qualitative DA classifications of new sc probes
Hybridizations on metaphase homologs from a subset of newly developed DA probes (XDH_IVS30-IVS27, ZNF385D_cen678130, DUOX1_IVS1-IVS3, TPM1_tel3200, PCK1_cen209-IVS6) and the control EA probe (3.3_1p36) were quantified using gradient vector flow (GVF) analysis [14,27] to validate the qualitative analysis of DA and examine the extent of variation in hybridization intensity between homologs. GVF quantified fluorescence intensity of each homolog probe hybridization in each metaphase image. The difference in probe fluorescence between homologs was calculated as a normalized integrated intensity ratio between homologs in each metaphase cell. For each sc probe, the target pair of chromosome homologs in 25 cells were analyzed (DA, n = 125 diploid cells; EA, n = 25 diploid cells). A significant difference (p < 0.0001) was determined between the median intensity ratio of DA (0.82) and EA (0.23) probe targets using a Mann-Whitney non-parametric test (Fig. 2B). The interquartile range for DA regions is 0.31-1.00 whereas that of the single EA region is 0.07-0.57. This trend was consistent with previous published characterization of different DA probes and multiple EA regions [14].
Table 1 Aggregated sc FISH probes and characteristics. bp = base pair. * Genomic position refers to the target sequence hybridized by sc FISH probe. Intergenic refers to a sc probe target between genes or outside of a gene. A gene name indicates that the sc probe target is within that given gene including exons and introns. Bolded probe names indicate the anchor probe (initial DA probe identified) from which neighboring sc probes were developed. a used in tissue conservation study, not domain analysis.
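The per-cell intensity comparison summarized in this section can be sketched as follows. The exact normalization is defined in the cited GVF work and is not restated here, so the formula below, the absolute difference between homolog intensities divided by their sum, is an assumption chosen only because it ranges from 0 (equal signals) to 1 (signal on one homolog only). The function names and the intensity values are hypothetical.

```python
import statistics


def normalized_intensity_ratio(intensity_a, intensity_b):
    """Assumed normalization: |a - b| / (a + b), giving 0 for equal signals."""
    total = intensity_a + intensity_b
    if total <= 0:
        raise ValueError("integrated intensities must sum to a positive value")
    return abs(intensity_a - intensity_b) / total


def summarize(ratios):
    """Median and interquartile range of per-cell ratios."""
    q1, _, q3 = statistics.quantiles(ratios, n=4)
    return {"median": statistics.median(ratios), "iqr": (q1, q3)}


# Hypothetical integrated intensities (arbitrary units) for homolog pairs
da_like_cells = [(900, 100), (750, 150), (820, 60), (400, 350), (980, 20)]
ratios = [normalized_intensity_ratio(a, b) for a, b in da_like_cells]
print(summarize(ratios))
```

Summaries of this kind for DA and EA probes would then be compared with a rank-based test such as Mann-Whitney, as described in the text.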

Relationship between open chromatin marks in interphase with mitotic accessibility characteristic of DA
Known open chromatin marks in lymphoblastoid cell line GM12878, namely DNase I hypersensitivity (DNase I HS), Formaldehyde Assisted Isolation of Regulatory Elements (FAIRE), and histone modifications (H3K4me, H3K9ac, H3K27ac, and H3K4me2), were compared at the genomic locations defined by the scFISH probes exhibiting DA and EA in lymphocytes during metaphase. The data reported are integrated intensity values from ENCODE data [28] reflecting chromatin accessibility during interphase (Additional file 2: Table S2). Overall, the mean integrated intensities of these 6 open chromatin marks in the newly identified DA regions (n = 17 of 18) were lower than those in equivalent accessibility (EA) sc intervals (n = 59 EA intervals, previously characterized in [14]; Additional file 3: Fig. S1A). The results, with the exception of one new DA probe, SCAMP2_IVS1, were consistent with the trend reported previously with a different set of DA probes [14]. The SCAMP2_IVS1 DA interval showed pronounced enrichment of open chromatin marks (2-75 fold difference with a mean ~ 18.4 fold increase), relative to the other DA loci in this study and those previously reported [14]. Open chromatin mark data for SCAMP2_IVS1 were excluded from the statistical analysis to prevent biased weighting of the intensity contributed by this probe sequence (Additional file 3: Fig. S1B, C).

Aggregation of adjacent sc probes identifies DA domain organization in human metaphase homologs
To determine the extent of DA in these targeted regions, we evaluated metaphase epigenotypes of sc probes within the same chromosomal regions. Neighbouring single copy intervals in the vicinity of DA anchor probes were scored for metaphase accessibility. When adjacent probes were scored as concordant for DA, they constituted a chromosomal DA domain. Domains are named according to the gene localized in the legacy FISH gene mapping publication from which sc probes were […] in an intergenic region with a total target length of 6.3 kb. HMGB1P1 is defined by 4 DA probes, 3 from intergenic sequences adjacent to RBM38, CTCFL and PCK1 and one from within PCK1 (5' end to IVS6), with a 12.2 kb target length.
Demonstration of 3 and 4 DA intervals within the COX5A and HMGB1P1 domains, respectively (without interspersion of EA intervals), supports the possibility that long range, possibly contiguous DA regions may be common in the genome. It was not possible to delimit the full extent or contiguous nature of these domains, as the sc probes themselves did not cover the entire genomic span of the inferred domains. The COX5A domain (Fig. 3C) contained an 87.4 kb region without sc probe coverage between SCAMP2_IVS1 and COX5A_tel20181, and the HMGB1P1 domain (Fig. 3D) exhibited an 82.9 kb gap between the CTCFL_cen34302 and PCK1_cen13065 probes, with smaller gaps of < 25 kb between the other probes.
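The relationship between probe positions, domain length and uncovered gaps can be made concrete with a small sketch. It assumes each concordant DA probe is represented by a (start, end) pair on one chromosome, with half-open coordinates in base pairs; the function name and the coordinates below are hypothetical placeholders, not the genome positions listed in Table 1.

```python
def domain_span_and_gaps(probe_intervals):
    """Return (domain length, list of gaps) for concordant DA probe intervals.

    The domain is taken to run from the first probe start to the last probe
    end; gaps are the stretches between consecutive probes without coverage.
    """
    ordered = sorted(probe_intervals)
    span = max(end for _, end in ordered) - ordered[0][0]
    gaps = [
        max(0, next_start - prev_end)
        for (_, prev_end), (next_start, _) in zip(ordered, ordered[1:])
    ]
    return span, gaps


# Hypothetical probe coordinates (bp), deliberately unsorted
probes = [(94_400, 97_000), (1_000, 3_000), (4_400, 7_000)]
span, gaps = domain_span_and_gaps(probes)
print(f"domain length: {span / 1000:.1f} kb")
print("gaps (kb):", [g / 1000 for g in gaps])
```

A domain defined this way is a lower bound on the true extent of DA, since flanking intervals without probes are not counted.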

DA is conserved among different cell types
Five scFISH probe loci that exhibit DA in peripheral blood PHA-stimulated lymphocytes were also evaluated in bone marrow and fibroblast tissues, and the observed DA patterns in metaphase cells were consistent among these tissues and different individuals (Fig. 4A). These characteristic differences in probe fluorescence between metaphase homologs were observed with genic probes XDH_IVS30-IVS27, PCK1_cen209-IVS6 (Fig. 4A, left), and DUOX1_IVS1-IVS3, and with intergenic probes TPM1_tel3200 (Fig. 4A, center) and CTCFL_cen34302. Individual scFISH probe analysis was generally performed on two samples for each tissue type (i.e. 6 hybridizations per probe) with 25 or more metaphase cells scored per sample. The exceptions (due to sample mitotic index limitations) were probe TPM1_tel3200, which was analyzed on a single fibroblast sample and a single bone marrow sample, and probes DUOX1_IVS1-IVS3 and XDH_IVS30-IVS27, which were each analyzed on one fibroblast sample.
Open chromatin marks were analyzed for sequences of sc probes in dermal fibroblast cell lines GM03348 (DNase I HS) and NHDF-Ad (histone modifications: H3K4me, H3K9ac, H3K27ac, and H3K4me2). The regions corresponded to each DA interval in this study and to previously reported EA intervals [14]. In fibroblasts, EA intervals showed a higher mean integrated intensity of all open chromatin marks analyzed in interphase relative to DA intervals, consistent with our results for lymphocytes both in this study (Additional file 2: Table S2) and previous DA characterizations [14]. Further, no significant difference was found between the mean integrated intensities of DA regions in lymphocytes and fibroblasts for all marks (DNase I HS p = 0.75, H3K4me p = 0.75, H3K27ac p = 0.66, and H3K4me2 p = 0.095) except one (H3K9ac p = 0.03) (Additional file 2: Table S2). FAIRE was not analyzed as data from a normal dermal fibroblast line was not available in the UNC FAIRE data set.
Using a two proportion Z-test (α = 0.05), the fraction of cells with DA for different probes was also similar between individuals for different samples (n = 6 bone marrow, 2 fibroblast, 10 PHA-stimulated peripheral blood lymphocyte). The only exception was probe CTCFL_cen34302, for which 2 bone marrow samples differed significantly in the proportion of cells with DA (p = 0.004; 98% [44/45 cells] vs 76% [38/49 cells] of cells). The DA pattern for CTCFL_cen34302 was not different between the two lymphocyte samples or the two fibroblast samples analyzed. DA was indistinguishable between T-lymphocytes, bone marrow, and fibroblasts at each locus (p > 0.99 based on the Kruskal-Wallis test) (Fig. 4B). The proportion of cells scored as DA for each DA locus across tissues was significant using a two-tailed binomial test with normal approximation (α = 0.05) (Additional file 4: Table S3). The chromosome 1p36 EA control probe also showed EA across all samples and cell types, with no difference between individuals (Fig. 4B, Additional file 4: Table S3).
Fig. 4 Chromatin accessibility patterns between metaphase homologs are conserved between different cell types. A Human metaphase cells from T-lymphocyte (top row), bone marrow (center row) and fibroblast (bottom row) cells hybridized with scFISH probes PCK1_cen209-IVS6 (chr 20q13.3, left column), TPM1_tel3200 (chr 15q22, center column), and 3.3_1p36 (right column). Hybridized homologs are indicated with arrows on the metaphase cells and enlarged homologs. The differential hybridization intensity observed across all tissues at the PCK1_cen209-IVS6 (left) and TPM1_tel3200 (center) loci is characteristic of differential accessibility (DA). Equivalent hybridized probe intensities observed at locus 3.3_1p36, characteristic of equivalent accessibility (EA), are also conserved between all tissues. Probe 3.3_1p36 serves as a control EA locus. Chromosomes were counterstained with DAPI. Probes were labelled with digoxigenin-11-dUTP and detected with Cy3-digoxin antibody. Cells were imaged using Metasystems Axioimager Z.2 epifluorescence microscope system with Metafer4 (V3.8.12) and Isis package (V5.3) imaging software. Images presented in inverted gray scale. B Proportions of cells scored with DA (black) within lymphocytes, bone marrow cells, and fibroblasts are not significantly different from each other when hybridized with sc probes for XDH_IVS30-IVS27, DUOX1_IVS1-IVS3, PCK1_cen209-IVS6, TPM1_tel3200 and CTCFL_cen34302. Three of these DA probe targets (XDH, DUOX1, PCK1) are within genes and the other two (TPM1 and CTCFL) are within intergenic regions. Proportions of cells scored with EA (light grey) across tissue types did not significantly differ from each other when hybridized with EA probe 3.3_1p36. Across the tissues examined for each DA or EA region, the accessibility between metaphase homologs remained the same. Sample size differs between each tissue and each probe (Additional file 4: Table S3). Significant differences were calculated using a Kruskal Wallis test (α = 0.05) comparing between tissues and proportion of cells scored as DA and EA.
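The tissue comparison reported in this section relies on a Kruskal-Wallis test across three groups (T-lymphocytes, bone marrow, fibroblasts). The sketch below is a minimal stdlib version restricted to exactly three groups, where the chi-square approximation has 2 degrees of freedom and the upper-tail probability reduces to exp(-H/2). The function name and the per-sample DA fractions are hypothetical and only illustrate the shape of the calculation; they are not the values in Table S3.

```python
import math
from itertools import chain


def kruskal_wallis_three_groups(group_a, group_b, group_c):
    """Kruskal-Wallis H (tie-corrected) and chi-square p-value for 3 groups."""
    groups = [group_a, group_b, group_c]
    pooled = sorted(chain.from_iterable(groups))
    n = len(pooled)

    # Assign the average rank to each distinct value and accumulate tie terms
    rank_of, tie_term, i = {}, 0, 0
    while i < n:
        j = i
        while j < n and pooled[j] == pooled[i]:
            j += 1
        rank_of[pooled[i]] = (i + 1 + j) / 2  # mean of ranks i+1 .. j
        tie_term += (j - i) ** 3 - (j - i)
        i = j

    rank_sum_term = sum(sum(rank_of[v] for v in g) ** 2 / len(g) for g in groups)
    h = 12 / (n * (n + 1)) * rank_sum_term - 3 * (n + 1)
    correction = 1 - tie_term / (n**3 - n)
    h = max(h / correction, 0.0) if correction > 0 else 0.0

    # With 3 groups, df = 2 and the chi-square survival function is exp(-h/2)
    return h, math.exp(-h / 2)


# Hypothetical fractions of cells scored as DA, one value per sample
lymphocytes = [0.82, 0.88]
bone_marrow = [0.85, 0.80]
fibroblasts = [0.84, 0.86]
h, p = kruskal_wallis_three_groups(lymphocytes, bone_marrow, fibroblasts)
print(f"H = {h:.3f}, p = {p:.3f}")
```

With only one or two samples per tissue the chi-square approximation is rough, so a result of this kind is best read as the absence of evidence for a tissue effect rather than proof of equivalence.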
DA conservation at the same loci in T-lymphocytes and bone marrow cells suggests DA is present and maintained in B-lymphocytes, as well as progenitor cells at various stages of differentiation. Fibroblasts yielded similar results suggesting DA, once established, is retained in tissues derived from both ectoderm and mesoderm germ layers.

Discussion
This study confirms and extends candidate DA regions identified by visually comparing FISH intensity differences between metaphase homolog images in legacy gene mapping publications [22][23][24][25][26][29]. Sc probes developed from these regions were assessed for DA or EA, confirming that biased hybridization intensity differences in these studies were likely the result of DA, rather than from technical aspects of hybridization to recombinant DNA-based probes. This conclusion was reinforced by probes from neighboring genomic intervals that also exhibited DA.
The chromosomal distribution and extent of adjacent differentially accessible intervals between homologs in metaphase, whether isolated or clustered in domains, had not been investigated until the present study. In interphase, differences in the epigenetic structures of an 8.16 Mb region of chromosome 19 homologs showed non-random differences in accessibility and volume, whose structures were highly variable between cells [30]. We describe 6 different DA domains, XDH, FGF6, COX5A, TPM1, HMGB1P5 and HMGB1P1, of varying lengths on different chromosomes. Domains in homologous metaphase chromosomes appear to be organized as contiguous sc intervals showing differential accessibility. Furthermore, these domains appear to be conserved along mitotic chromosomes of different germline origins and hematopoietic differentiation states.
The TPM1, XDH and FGF6 domains consist of tightly clustered DA regions. By contrast, the HMGB1P5, COX5A, and HMGB1P1 domains, contain larger gaps between the DA regions confirmed using scFISH (although in some regions, probe development was constrained by the minimum lengths and densities of the single copy intervals). This raises the intriguing possibility that DA occurs more often within neighbouring single copy regions than we have previously described. Based on our previous work which identified DA in ~ 10% of scFISH probes, it seems plausible that expansion of these domains by linking adjacent short sc intervals likely increases the overall proportion of mitotic chromatin that may be subject to DA [14].
None of the DA loci defining individual domains adjoin one another. The shortest distance between DA loci is ~ 1.4 kb, and the largest ~ 87.3 kb. The major constraint in designing single copy FISH probes was related to the intrinsic distribution of repetitive elements within these regions. Sequences containing repetitive elements with divergent sequences < 20% from consensus family members were excluded from probe design to avoid nonspecific cross-hybridization across the genome [14,16,17,31]. It was also not possible to identify EA sequences flanking DA intervals, despite intensive efforts to delimit boundaries of DA domains by selecting sc intervals of increasing distances from the anchor DA probe. Previously published work from our laboratory has demonstrated that ~ 90% of single-copy probes from regions derived from clinically relevant genes/genomic regions exhibit EA [14]. For this reason, it is likely that the boundaries of the domains described here will eventually be circumscribed by adjoining sc intervals displaying EA.
Recent models of metaphase chromatin organization have suggested that oligomeric, nucleosomal, associated protein-DNA, and spacer complexes can be structured as multi-layered intercalated plates or as stacked thin-layered solenoids [32,33]. It may be possible to reconcile differences between these models by incorporating regional differences in catenation of chromatin [15]. Our proposed model of DA suggests a difference in the numbers of topoisomerase-induced supercoils, i.e. winding number, between homologs without changes in loop frequency or helical pitch [15]. Structurally, this aligns well with the proposed multi-layer plate metaphase folding model that is in equilibrium between a condensed and relaxed state [32][33][34]. This particular model also aligns best with chromosome banding and band splitting along the length of the chromosome [32,33]. However, these models cannot address why the compaction states of some allelic segments of metaphase homologs would be consistently different (i.e. exhibit DA). Nor do they account for differences in extended DA domains in homologous chromosomes or their conservation among tissues with different origins.
The presence of DA domains in metaphase, rather than isolated DA intervals, is consistent with the proposal that these features of metaphase homologs may be correlated with or be precursors to topologically associated domains (TADs) re-established during interphase. TADs have been suggested to form coherent structural units of (primarily) cis-interacting genomic sequences in interphase chromatin [35,36]. TADs facilitate interactions with regulatory elements and their gene targets within the defined boundaries of chromatin scaffolds. These interphase organizations are almost certainly eliminated during mitosis to allow condensation of chromatin [4,5], including the loss of transcription factors important in establishing compartmentalization (e.g. CTCF) in interphase; therefore, understanding the mechanisms responsible for re-establishing these interactions in daughter cells is of considerable interest.
In this study, to assess correspondence between loci of DA in metaphase and proximate interphase TAD structures, chromatin conformation capture information from Hi-C analysis in lymphoblast cell-line GM12878 [37] was visualized for DA domains and sc probe intervals using the 3D Genome Browser [38]. Five of the six DA domains defined in metaphase chromosomes (XDH, FGF6, COX5A, TPM1, HMGB1P1) each correspond to an interphase region contained within a single TAD (Fig. 5A-E). Domains XDH (Fig. 5A), FGF6 (Fig. 5B), and HMGB1P1 (Fig. 5D) correspond to sequences in the middle of their respective TADs, while the COX5A (Fig. 5C) and TPM1 (Fig. 5E) domains approach a TAD boundary. The other DA domain, HMGB1P5, occurs between adjacent TADs proximate to one of the TAD boundaries (Fig. 5F). The extent of intra-chromosomal contacts or compartments within the corresponding DA domains (and flanking regions) is indicated by heat maps showing relative contact frequencies (Fig. 5A-F). The insets highlight the overlap of DA domains with areas of frequent localized short-range intra-chromatin interactions and looping, which suggest compartments within the larger TAD structure [35,37].
The topology of interphase chromosomes based on 5C studies indicates numerous interactions between neighboring sequences within the same TAD. Interaction between neighboring DA segments has not been documented in this or previous studies. Nevertheless, the mitotic epigenotypes of these individual segments appear to be consistent with extensive condensation or catenation levels in these regions on the same homolog. Catenation differences between homologs revealed by DA may be associated with differences in chromatin folding and their association with gene expression and regulation during the subsequent interphase [6,9,35]. The preponderance of metaphase DA domains corresponding to sequences that each occur within a self-contained TAD in interphase is consistent with DA serving as a structural link that conserves large-scale chromatin organization between mitotic and interphase chromosomes. These findings motivate a thorough genome-wide analysis of the alignment of DA domains with Hi-C chromatin conformation data underlying TAD structures. This would clarify whether the epigenetic relationships noted here between metaphase and interphase chromatin organizations are generalizable.
FISH signal intensities in these DA domains were consistent with previous comparisons of reported DA and EA regions [14]. Epigenetic open chromatin marks of the probes in this study were also consistent with our previous analyses of other DA probes. DA loci exhibit reduced characteristics of open chromatin (DNase I hypersensitivity (DNase I HS), Formaldehyde Assisted Isolation of Regulatory Elements (FAIRE), and histone modifications H3K4me, H3K9ac, H3K27ac, and H3K4me2) compared to previously described loci with equivalent accessibility [14]. We also found that mean integrated intensity values of DNase I HS, H3K9ac, H3K27ac, and H3K4me2 at DA loci were significantly lower, whereas the differences for FAIRE and H3K4me were not significant. The SCAMP2_IVS1 genomic interval was an outlier in that it was highly enriched for these open chromatin marks, which likely reflects the proximity of this probe sequence to the SCAMP2 promoter that is highly expressed in B-lymphocytes (https://gtexportal.org/home/gene/SCAMP2). Further, the integrated intensity per base pair of open chromatin marks analyzed for scFISH probe coverage of each domain was representative of the overall contiguous domain in lymphocytes.
Initially, DA loci and domain characterization were defined using peripheral T-lymphocytes. It is now apparent that DA at these loci is conserved in bone marrow and dermal fibroblasts as well. Three DA loci within genes and two in intergenic regions were identified in all tissues. The samples from all individuals showed DA for all probes tested, and the proportions of cells exhibiting DA with a specific probe were generally similar between samples from different individuals or tissue types (T-lymphocytes, bone marrow, fibroblasts). Open chromatin marks (largely from interphase) in fibroblasts were analogous to those seen at these loci in lymphocytes (i.e., lower mean integrated intensities in new DA regions relative to EA regions). Differences between open chromatin marks in lymphocytes and fibroblasts at these new DA regions were largely unremarkable, with a marginally significant difference only for H3K9 acetylation. This suggests that the same epigenetic characteristics used to define DA regions during interphase in lymphocytes could also be used for fibroblasts. The mitotic cells in bone marrow would include both B lymphocytes and other progenitor cells at different stages of differentiation. That the same DA domains occur during multiple stages of hematopoiesis and from different germ layers (mesoderm: lymphocytes and ectoderm: fibroblast) suggests that establishment of DA may be an innate property of mitotic chromosome condensation. If DA is a stable chromatin mark throughout development, then its presence in both mesoderm and ectoderm derived cells would indicate its establishment in early embryogenesis. The establishment of DA structures early in mitosis distinguishes homologs and could represent a transgenerational mechanism that preserves sister chromatid identity after cell division. Such a mechanism would be consistent with our previous findings that demonstrate conservation of DA between inherited or derivative chromosomes [14].

Single copy probe design, development of probes for fluorescence in situ hybridization (FISH)
Methods for scFISH have been described previously [14][15][16][17][31]. The overall process for FISH probe development involved precise definition of each single copy (sc) interval, ranging in length from ~ 1.4 to 4 kilobases (kb), by specific human genome coordinates. The sequence of each sc interval was amplified from human genomic DNA with polymerase chain reactions (PCR) optimized for long products, followed by gel purification of amplicons and labelling by nick translation with a modified nucleotide (digoxigenin-11-dUTP) prior to hybridization to metaphase chromosomes. Following hybridization, probes were detected with a fluorescence labeled antibody against digoxigenin on metaphase chromosomes stained with 4',6-diamidino-2-phenylindole (DAPI). Cells were imaged using a Metasystems computer assisted epifluorescence microscope system. Sc DNA probes comprised either unique DNA sequences or highly divergent repetitive sequences (> 20% divergence) that behave as unique sequence targets during chromosomal hybridization [14][15][16][17][31]. Sc genomic intervals were excluded if they were present in copy number variants with ≥ 1% population frequency [14] and were observed in independent microarray datasets, including Ontario Population Genomics Platforms (n = 873 individuals of European ancestry; minimum 25 probes per CNV; Database of Genomic Variants), and Healthy sample set (n = ~ 400 individuals; minimum 35 probes per CNV, Affymetrix), which were used to identify common CNVs with ChAS (Chromosome Analysis Suite) software analysis of ThermoFisher (formerly Affymetrix) CytoScan HD arrays (Additional file 5: Table S4).

Oligonucleotide primer design and sc amplicon production
Primer pairs for each selected sc interval were designed using Primer-BLAST [39]. Sc intervals were identified using RepeatMasker (University of California Santa Cruz (UCSC) Genome Browser). The DNA sequence (GRCh37/hg19) for the full sc interval, obtained from the UCSC Genome Browser [40] was the PCR template used to generate all primer pair options. Generally, 15-20 primer pairs were designed for each sc interval. The maximum size of the PCR product was limited by the length of the sc interval in base pairs (bp) and the minimum length was 200-500 bp less than the maximum. The selected primer melting temperature (Tm) range was 58.0-65.0 °C, with an optimal Tm of 62.0 °C. The maximum Tm difference between a pair of primers was limited to 2 °C. Primer pair specificity was verified using the "RefSeq representation genome" database for alignment with the human genome by BLAST ® (Basic Local Alignment Search Tool) [39] as well as separate assessment by BLAT (GRCh37/hg19 and CHM13 [41]). The nucleotide coordinates of the primer pairs reported from the GRCh38/hg38 genome assembly were converted to GRCh37/hg19 coordinates using the UCSC genome browser. Optimal primer pairs minimized the self-complementarity of individual primers and the Tm difference between the pair. Primers in which the PCR product had unintended targets and generally those outside the 40-60% GC content range were avoided. Longer primers (> 25 bp) were preferred. Primer pairs were synthesized by Integrated DNA Technologies, Inc (Toronto, ON). Long PCR reaction conditions using hot start DNA polymerase Kappa HiFi (Promega Corporation) according to the manufacturer's instructions were optimized for each sc interval using a gradient PCR thermocycler (Eppendorf vapo.protect ™ Hamburg, Germany). Optimized PCR conditions were then used for scale-up of the target amplicons. The amplicons were gel purified and labelled by nick translation for use in fluorescence in situ hybridization [14,17,42]. 
The primer details and PCR optimization cycling parameters are provided in Additional file 6: Table S5.
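The primer selection criteria described above can be illustrated with a minimal Python sketch. It is not the pipeline used in this study: the `PrimerPair` class, the function names, and the ranking rule are hypothetical, the Tm values are assumed to be supplied by the design tool, the GC content range is applied as a hard filter although the text describes it as a general preference, and self-complementarity and off-target checks are not modelled.

```python
from dataclasses import dataclass

# Thresholds taken from the text; everything else is illustrative.
TM_MIN, TM_MAX, TM_OPT = 58.0, 65.0, 62.0   # degrees C
MAX_TM_DIFF = 2.0                            # degrees C
GC_MIN, GC_MAX = 40.0, 60.0                  # percent


def gc_percent(seq: str) -> float:
    """Percentage of G and C bases in a primer sequence."""
    seq = seq.upper()
    return 100.0 * sum(base in "GC" for base in seq) / len(seq)


@dataclass
class PrimerPair:
    forward: str
    reverse: str
    tm_forward: float  # Tm reported by the design tool (assumed input)
    tm_reverse: float


def passes_filters(pair: PrimerPair) -> bool:
    """Apply the Tm range, Tm difference, and GC content criteria."""
    tms = (pair.tm_forward, pair.tm_reverse)
    if not all(TM_MIN <= tm <= TM_MAX for tm in tms):
        return False
    if abs(pair.tm_forward - pair.tm_reverse) > MAX_TM_DIFF:
        return False
    return all(GC_MIN <= gc_percent(s) <= GC_MAX
               for s in (pair.forward, pair.reverse))


def rank_pairs(pairs):
    """Keep passing pairs; smallest Tm difference first, then closest to optimal Tm."""
    kept = [p for p in pairs if passes_filters(p)]
    return sorted(
        kept,
        key=lambda p: (abs(p.tm_forward - p.tm_reverse),
                       abs(p.tm_forward - TM_OPT) + abs(p.tm_reverse - TM_OPT)),
    )
```

In practice, the 15-20 candidate pairs per sc interval would be passed to `rank_pairs`, and the top candidates would still require the BLAST and BLAT specificity checks described above.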

Cytogenetic preparations
Cytogenetic fixed cell preparations were obtained from phytohemagglutinin (PHA)-stimulated peripheral blood, bone marrow, and dermal fibroblast samples. The cytogenetic cell preparations were derived from de-identified residual cell pellets that remained after routine cytogenetic diagnostic procedures were completed at the London Health Sciences Centre Clinical Cytogenetics Laboratory (University of Western Ontario Office of Research Ethics, CER approval #5453). Cytogenetically normal cell pellets were used for bone marrow samples. Cell pellets were produced following routine cytogenetic protocols for cell culture and harvest [14] and fixed with 3 parts methanol: 1 part glacial acetic acid (Carnoy's fixative).

Sc probe selection for examining DA domains
All sc probes in Table 1 were developed in this study, with the exception of sc probe 3.3_1p36 [14,17], which is a control probe showing EA (Additional file 7: Figure S2). For each domain, these consisted of anchor probes with confirmed DA as well as multiple scFISH probes linked in the genome to these anchor sc probes. The anchor probes were designed and produced from genomic regions corresponding to published legacy chromosomal localization studies of XDH [22], HMGB1P5 and HMGB1P1 [23], FGF6 [24], TPM1 [25], and COX5A [26]. These genes map to chromosome bands 2p23 (XDH), 3p24 (HMGB1P5), 12p13 (FGF6), 15q22 (TPM1), 15q25 (COX5A), and 20q13 (HMGB1P1). Legacy publications that mapped genes on human chromosomes by FISH were identified through PubMed and journal searches. Many of these gene mapping studies were published prior to the initial assembly of the complete human genome sequence in 2001. The 'gene mapping' FISH probes [22-26,29] generally consisted of recombinant DNA with long human genomic inserts that ranged in length from ~ 50 kb to several hundred kb, and for which the full genomic sequence was not known. We scrutinized the FISH images in these publications to identify potential differences in the fluorescence intensities of signals hybridizing to each chromosome homolog, which are characteristic of DA. Loci whose published images appeared to exhibit differential hybridization were further characterized in our laboratory by scFISH to determine whether the published intensity differences met our criteria for DA. The locations of the FISH probe genomic targets were determined using the probe-specific gene mapping details in these or related publications, such as restriction enzyme mapping and partial gene sequencing, which were then used to computationally localize sc intervals in the current human genome assembly. Sc probes were developed from within the large genomic target regions using previously published methods [14-20].
If DA was determined to be present by scFISH, the sc probe then served as an anchor probe from which to develop neighboring probes. The neighboring probes were used to determine if DA extended beyond the anchor sequence and formed a larger DA domain.
All sc probes developed in this study were hybridized to lymphocyte metaphase chromosomes to confirm the expected chromosomal band location and then scored for hybridization pattern (i.e., DA or EA) as summarized below using our previously described methods [14,15,17]. Domains were named based on the HUGO-approved gene name in the corresponding legacy gene mapping publication from which the anchor probe was derived. Sc probes are named according to their location within or adjacent to the gene from which they were derived. In intergenic regions, probes are identified by the coding gene closest to the sc interval, with centromeric (cen) or telomeric (tel) indicating the position of the probe relative to that gene, followed by the distance in nucleotides between the gene and the interval. Probes localizing within genes are named with the gene and the interval of exons and introns spanned, guided by conventions stipulated by Human Genome Variation Society (HGVS) nomenclature.

Scoring differential (DA) and equivalent accessibility (EA) of sc probe hybridization between metaphase homologous chromosomes: qualitative and quantitative
Evaluation of differences in the hybridized probe fluorescence intensity between homologs was performed as previously reported [14,15]. Chromosome identification and scoring of the intensity of hybridized probe fluorescence signals (dim, medium, bright) were performed independently by a minimum of 2 analysts. A metaphase cell was considered to show differential accessibility (DA) if homologs were scored with different intensities (e.g. bright/medium, bright/dim, medium/dim, bright/nil). A cell was scored as equivalently accessible (EA) when homologs were scored with equivalent intensities (e.g. bright/bright, medium/medium). Any scores of dim/dim, nil/nil, or dim/nil were excluded. Cells in which the hybridized chromosomes overlapped other chromosomes at or near the location of probe hybridization were also excluded to rule out potential hybridization effects on the targets. Twenty-five or more cells were scored for most samples, and a minimum of 2 samples were evaluated per scFISH probe for probe validation. A two-tailed binomial test with normal approximation was used to determine if there was a significant difference between the proportion of DA cells and that of EA cells [14]. Additionally, a two-proportion Z-test was used to test if the proportion of DA cells differed between samples. Both statistical tests were performed at α = 0.05.
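The scoring rules and the two statistical tests above can be sketched in Python as follows. This is an illustration under stated assumptions, not the analysis code of the study: the function names are hypothetical, the binomial test is assumed to use a null proportion of 0.5 (DA and EA equally likely) without continuity correction, and the two-proportion Z-test is assumed to use a pooled variance estimate.

```python
import math

LEVELS = {"nil", "dim", "medium", "bright"}


def classify_cell(homolog_a: str, homolog_b: str):
    """Return 'DA', 'EA', or None (excluded) for one metaphase cell."""
    if homolog_a not in LEVELS or homolog_b not in LEVELS:
        raise ValueError("unknown intensity score")
    # dim/dim, nil/nil and dim/nil cells are excluded from scoring.
    if {homolog_a, homolog_b} <= {"dim", "nil"}:
        return None
    return "EA" if homolog_a == homolog_b else "DA"


def _two_tailed_p(z: float) -> float:
    """Two-tailed p-value for a standard normal statistic."""
    return math.erfc(abs(z) / math.sqrt(2.0))


def binomial_normal_test(n_da: int, n_ea: int, p0: float = 0.5):
    """Normal approximation to the binomial test of DA versus EA counts.

    Assumes a null DA proportion of p0 and no continuity correction.
    """
    n = n_da + n_ea
    z = (n_da - n * p0) / math.sqrt(n * p0 * (1.0 - p0))
    return z, _two_tailed_p(z)


def two_proportion_z_test(da1: int, n1: int, da2: int, n2: int):
    """Pooled two-proportion Z-test for DA proportions in two samples."""
    pooled = (da1 + da2) / (n1 + n2)
    se = math.sqrt(pooled * (1.0 - pooled) * (1.0 / n1 + 1.0 / n2))
    if se == 0.0:  # all cells DA or all EA in both samples
        return 0.0, 1.0
    z = (da1 / n1 - da2 / n2) / se
    return z, _two_tailed_p(z)
```

For example, a hypothetical sample with 20 DA and 5 EA cells gives z = 3.0 under these assumptions, which is significant at α = 0.05.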
Visual differences in hybridized probe fluorescence intensities between homologs within the same cell were quantified using the gradient vector flow (GVF) algorithm that we previously developed [14,27]. GVF determines FISH probe boundaries for each chromosomal hybridization as a binary contour and integrates the probe fluorescence across the subset of pixels comprising each signal [27]. Integrated signal intensities for homologs 1 and 2 are defined as $I_1$ and $I_2$, respectively. To determine differences between the signals of each homolog within a cell, a normalized intensity ratio was calculated: $\lvert I_1 - I_2 \rvert / (I_1 + I_2)$. Values close to 0 indicate homologs with EA, whereas values close to 1 reflect the differences in signal intensity present in DA [14]. A Mann-Whitney U test was used to determine whether a bias in hybridization signal intensities between homologous regions was statistically significant.
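As a small worked illustration of the normalized intensity ratio, the sketch below assumes the ratio takes the form |I1 - I2| / (I1 + I2), which is bounded between 0 (equal signals) and 1 (signal on one homolog only), as the description requires. The function name and the input values are hypothetical, and the GVF segmentation that produces the integrated intensities is not reproduced here.

```python
def normalized_intensity_ratio(i1: float, i2: float) -> float:
    """Normalized difference between integrated homolog intensities.

    Assumed form: |I1 - I2| / (I1 + I2); 0 means equal signals,
    1 means that only one homolog carries signal.
    """
    if i1 < 0 or i2 < 0:
        raise ValueError("integrated intensities must be non-negative")
    total = i1 + i2
    if total == 0:
        raise ValueError("at least one homolog must have signal")
    return abs(i1 - i2) / total


# Hypothetical integrated intensities for three cells.
cells = [(100.0, 100.0), (300.0, 100.0), (50.0, 0.0)]
ratios = [normalized_intensity_ratio(a, b) for a, b in cells]
```

Here the first cell has a ratio of 0.0 (EA-like), the second 0.5, and the third 1.0 (the strongest possible DA-like difference).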

Sc probe selection for investigating DA in different cell types
To avoid confounding factors such as differential tissue expression that could influence chromatin accessibility, sc probes were selected from within genes that had little to no expression (0.0-5.0 transcripts per million [TPM]) across all tissues of interest (lymphocytes/blasts, bone marrow, fibroblast). Expression data in TPM were downloaded from the Genotype-Tissue Expression (GTEx) [44] and Human Protein Atlas [45,46] databases. GTEx expression data were from EBV-transformed lymphocytes and fibroblasts, with multiple samples representing each tissue. The mean and standard deviation across samples were computed with a custom Python script. The Human Protein Atlas data were derived from multiple bone marrow samples and obtained as mean expression values. A subset of the sc probes developed during this study that demonstrated DA in T-lymphocytes was selected to assess whether DA at these loci was conserved in bone marrow cells and fibroblasts. DA intervals within genes (intronic and exonic) as well as in intergenic intervals were selected to establish DA across different tissues in both gene coding and noncoding intervals. The probes selected within genes were XDH_IVS30-IVS27, PCK1_cen209-IVS6, and DUOX1_IVS1-IVS3. Intergenic DA regions that were assumed to be transcriptionally inactive based on UCSC Genome Browser annotations included TPM1_tel3200 and CTCFL_cen34302. The DUOX1_IVS1-IVS3 sc probe (chr 15q23) was developed and validated after review of historical FISH images within a SORD gene mapping study [29].
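The expression filter described above can be sketched as follows. This is not the script used in the study: the function names and the example TPM values are hypothetical, and applying the 5.0 TPM cutoff to the per-tissue mean is an assumption about how the criterion was evaluated.

```python
from statistics import mean, stdev

TPM_CUTOFF = 5.0  # upper bound of the "little to no expression" range


def summarize_expression(tpm_by_tissue):
    """Return {tissue: (mean TPM, sample standard deviation)}.

    Each tissue needs at least two samples for the standard deviation.
    """
    return {tissue: (mean(values), stdev(values))
            for tissue, values in tpm_by_tissue.items()}


def is_low_expression(tpm_by_tissue, cutoff=TPM_CUTOFF):
    """True if the mean TPM is at or below the cutoff in every tissue."""
    summary = summarize_expression(tpm_by_tissue)
    return all(avg <= cutoff for avg, _ in summary.values())


# Hypothetical per-sample TPM values for one gene.
example_gene = {
    "lymphocytes": [0.2, 0.4, 0.3],
    "fibroblasts": [1.0, 3.0],
}
```

With these hypothetical values, `is_low_expression(example_gene)` returns `True`, so a probe within this gene would be eligible for the cross-tissue comparison.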

Sequence comparison of epigenetic open chromatin marks between single copy probe genomic intervals exhibiting DA or EA
Epigenetic features characteristic of open chromatin were analyzed following the same approach that we have previously reported for other EA and DA genomic intervals [14]. The open chromatin properties extracted from the Encyclopedia of DNA Elements (ENCODE) [28] that were compared with mitotic accessibility included: DNase I hypersensitivity (Duke, DNase I HS), Formaldehyde-Assisted Isolation of Regulatory Elements (FAIRE) (University of North Carolina, FAIRE-seq) and histone marks H3K4me1, H3K9ac, H3K27ac, and H3K4me2 (Broad Institute, histone modifications). All open chromatin marks reported were derived from data collected from the Epstein-Barr virus (EBV) transformed lymphoblastoid cell line, GM12878, in which DA had previously been characterized [14], and untransformed dermal fibroblast lines: GM03348 (DNase I HS) and NHDF-Ad (H3K4me1, H3K9ac, H3K27ac, and H3K4me2). All histone modification data were derived from ChIP-seq (chromatin immunoprecipitation assay with sequencing) signal intensities. The cumulative sum of signals for each open chromatin mark was determined for all sc intervals, and a mean integrated intensity was calculated for DA and EA groups individually. Box-and-whisker plots of each mark for both DA and EA were used to visualize these distributions. Unpaired t-tests with Welch correction were used to test for significant differences (α = 0.05) in the mean integrated intensity of each chromatin mark between DA and EA intervals in lymphocytes and fibroblasts, as well as in the integrated intensity per base pair between full DA domains and scFISH domain coverage. The open chromatin marks for new DA probes developed in this investigation were compared to previously reported EA probe intervals [14]. Open chromatin mark data for SCAMP2_IVS1 were censored from the DA interval data set prior to statistical testing between DA and EA loci.
SCAMP2_IVS1 is within intron 1, a gene segment in which promoters have been identified [47,48]. Together with the pronounced enrichment of open chromatin marks, this is consistent with SCAMP2_IVS1 localizing within the highly accessible SCAMP2 promoter. This sequence is not representative of the predominantly intergenic locations (n = 10) that characterize the other DA probes; therefore, SCAMP2_IVS1 was excluded from the analysis of the above interphase chromatin features to prevent biased weighting of the total integrated intensities by probe sequences.
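The comparison of integrated intensities between DA and EA intervals can be illustrated with a minimal Python sketch of the Welch statistic. The function names and all signal values are hypothetical, and the sketch returns only the t statistic and the Welch-Satterthwaite degrees of freedom; obtaining a p-value additionally requires the t distribution, for example from a statistics package.

```python
import math
from statistics import mean, variance


def integrated_intensity(signal):
    """Cumulative sum of a signal track across one sc interval."""
    return sum(signal)


def welch_t(group_a, group_b):
    """Welch's unequal-variance t statistic and degrees of freedom."""
    va = variance(group_a) / len(group_a)  # squared standard error, group A
    vb = variance(group_b) / len(group_b)  # squared standard error, group B
    t = (mean(group_a) - mean(group_b)) / math.sqrt(va + vb)
    df = (va + vb) ** 2 / (
        va ** 2 / (len(group_a) - 1) + vb ** 2 / (len(group_b) - 1)
    )
    return t, df


# Hypothetical integrated intensities of one mark for DA and EA intervals.
da_intensities = [1.0, 2.0, 3.0, 4.0, 5.0]
ea_intensities = [2.0, 4.0, 6.0, 8.0, 10.0]
t_stat, dof = welch_t(da_intensities, ea_intensities)
```

The Welch form is used because the DA and EA groups are not assumed to have equal variances or equal numbers of intervals.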

Higher order chromatin structures in DA domains
The organization of DA domain intervals with respect to higher-order chromatin structures, specifically topologically associated domains (TADs), was analyzed using the public 3-D Genome Browser [38] with chromatin conformation capture (Hi-C) data from the lymphoblast cell line GM12878 [37]. Chromatin interaction frequency heatmaps were generated at a resolution of 25 kb spanning DA domain and sc probe locations (GRCh37/hg19) within the UCSC Genome Browser [40].
Correspondence of DA domains with TADs and other intra-TAD interactions was analyzed from scaled heatmap and genome browser outputs from the 3-D Genome Browser and UCSC Genome Browser, respectively [38,40].
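The correspondence check between a DA domain and TAD boundaries amounts to an interval containment test, which can be sketched as below. The study performed this comparison from browser outputs; the function name and all coordinates here are hypothetical and serve only to illustrate the logic.

```python
def enclosing_tads(domain, tads):
    """Return the TAD intervals (start, end) that fully contain the domain.

    All coordinates are assumed to be on the same chromosome and assembly.
    """
    d_start, d_end = domain
    return [(start, end) for start, end in tads
            if start <= d_start and d_end <= end]


# Hypothetical coordinates (bp) for one DA domain and two adjacent TADs.
da_domain = (1_000_000, 1_050_000)
tad_calls = [(900_000, 1_200_000), (1_200_000, 1_500_000)]
hits = enclosing_tads(da_domain, tad_calls)
```

A domain that straddles a TAD boundary returns an empty list, which would flag it as not encompassed by a single TAD.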