Butterfield Russell J, Dunn Diane M, Duval Brett, Moldt Sarah, Weiss Robert B
Department of Pediatrics, University of Utah, Salt Lake City, UT.
Department of Neurology, University of Utah, Salt Lake City, UT.
bioRxiv. 2023 Mar 29:2023.02.17.528868. doi: 10.1101/2023.02.17.528868.
Facioscapulohumeral muscular dystrophy (FSHD) is caused by a unique genetic mechanism that relies on contraction and hypomethylation of the D4Z4 macrosatellite array on the chromosome 4q telomere, which allows ectopic expression of the DUX4 gene in skeletal muscle. Genetic analysis is difficult because of the large size and repetitive nature of the array, a nearly identical array on the 10q telomere, and the presence of divergent D4Z4 arrays scattered throughout the genome. Here, we combine nanopore long-read sequencing with Cas9-targeted enrichment of 4q and 10q D4Z4 arrays for comprehensive genetic analysis, including determination of the length of the 4q and 10q D4Z4 arrays with base-pair resolution. In the same assay, we differentiate 4q from 10q telomeric sequences, determine A/B haplotype, identify paralogous D4Z4 sequences elsewhere in the genome, and estimate methylation for all CpGs in the array. Asymmetric, length-dependent methylation gradients were observed in the 4q and 10q D4Z4 arrays that reach a hypermethylation point at approximately 10 D4Z4 repeat units, consistent with the known threshold of pathogenic D4Z4 contractions. High-resolution analysis of individual D4Z4 repeat methylation revealed areas of low methylation near the CTCF/insulator region and areas of high methylation immediately preceding the DUX4 transcriptional start site. Within the DUX4 exons, we observed a waxing/waning methylation pattern with a 180-nucleotide periodicity, consistent with phased nucleosomes. Targeted nanopore sequencing complements recently developed molecular combing and optical mapping approaches to genetic analysis for FSHD by adding precise length measurement, base-pair resolution sequencing, and quantitative methylation analysis.
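The 180-nucleotide periodicity reported for methylation within the DUX4 exons is the kind of signal that an autocorrelation scan can surface. The following is a minimal sketch of that idea, not the authors' analysis pipeline: the function name `dominant_period`, the search window of 100 to 300 nucleotides, and the synthetic per-nucleotide methylation profile are all assumptions made for illustration, and no data from the study are used.

```python
import math


def dominant_period(signal, min_lag, max_lag):
    """Return (lag, autocorrelation) for the lag with the highest
    normalized autocorrelation within [min_lag, max_lag]."""
    n = len(signal)
    mean = sum(signal) / n
    centered = [x - mean for x in signal]
    variance_sum = sum(c * c for c in centered)
    best_lag, best_r = None, float("-inf")
    for lag in range(min_lag, max_lag + 1):
        # Correlate the profile with a copy of itself shifted by `lag`.
        r = sum(centered[i] * centered[i + lag] for i in range(n - lag)) / variance_sum
        if r > best_r:
            best_lag, best_r = lag, r
    return best_lag, best_r


# Synthetic profile (hypothetical, for illustration only): methylation
# fraction at 1800 consecutive positions, oscillating with a 180-nt period.
profile = [0.5 + 0.3 * math.sin(2 * math.pi * i / 180) for i in range(1800)]

period, strength = dominant_period(profile, min_lag=100, max_lag=300)
print(period, round(strength, 2))
```

On real data, methylation is only measured at CpG sites, so the profile would first need to be interpolated or binned onto a regular grid before a scan like this is meaningful.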