Pangenome discovery of missing autism variants.
Author information
Sui Yang, Lin Jiadong, Noyes Michelle D, Kwon Youngjun, Wong Isaac, Koundinya Nidhi, Harvey William T, Wu Mei, Hoekzema Kendra, Munson Katherine M, Garcia Gage H, Knuth Jordan, Wertz Julie, Wang Tianyun, Hennick Kelsey, Karunakaran Druha, Polo Prieto Rafael A, Meyer-Schuman Rebecca, Cherry Fisher, Pehlivan Davut, Suter Bernhard, Gustafson Jonas A, Miller Danny E, Berk-Rauch Hanna, Nowakowski Tomasz J, Chakravarti Aravinda, Zoghbi Huda Y, Eichler Evan E
Affiliations
Department of Genome Sciences, University of Washington School of Medicine, Seattle, WA 98195, USA.
Department of Medical Genetics, Center for Medical Genetics, School of Basic Medical Sciences, Peking University Health Science Center, Beijing 100191, China.
Publication information
medRxiv. 2025 Jul 22:2025.07.21.25331932. doi: 10.1101/2025.07.21.25331932.
Autism spectrum disorders (ASDs) are genetically and phenotypically heterogeneous, and the majority of cases remain genetically unresolved. To better understand large-effect pathogenic variation, we generated long-read sequencing data to construct phased and near-complete genome assemblies (average contig N50 = 43 Mbp, QV = 56) for 189 individuals from 51 families with unsolved cases of autism. We applied read- and assembly-based strategies to comprehensively characterize de novo mutations (DNMs), structural variants (SVs), and DNA methylation profiles. Merging common SVs obtained from long-read pangenome controls, we efficiently filtered >97% of common SVs exclusive to 87 offspring. We find no evidence of increased autosomal SV burden for probands when compared to unaffected siblings, yet note a trend toward increased SV burden on the X chromosome among affected females. We establish a workflow to prioritize potential pathogenic variants by integrating autism risk genes and putative noncoding regulatory elements defined from ATAC-seq and CUT&Tag data from the developing cortex. In total, we identified three pathogenic variants in , , and , as well as nine candidate and biparental homozygous SVs, most of which were missed by short-read sequencing. Our work highlights the potential of phased genomes to discover more complex pathogenic mutations and the power of the pangenome to restrict the focus to an increasingly smaller number of SVs for clinical evaluation.
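For readers unfamiliar with the two assembly-quality figures quoted above, the following minimal sketch shows how such numbers are conventionally computed. It assumes QV is the usual Phred-scaled consensus accuracy (the abstract does not define it), and the function names are illustrative, not taken from the authors' pipeline.

```python
import math


def contig_n50(lengths):
    """Return the N50 of a set of contig lengths.

    N50 is the length L of the contig at which, walking from the longest
    contig downward, the running total first reaches half of all
    assembled bases.
    """
    if not lengths:
        raise ValueError("need at least one contig length")
    total = sum(lengths)
    running = 0
    for length in sorted(lengths, reverse=True):
        running += length
        if 2 * running >= total:
            return length


def qv_to_error_rate(qv):
    """Convert a Phred-scaled quality value to a per-base error probability.

    Assumes the standard Phred convention QV = -10 * log10(p).
    """
    return 10 ** (-qv / 10)


def error_rate_to_qv(p):
    """Inverse of qv_to_error_rate under the same Phred assumption."""
    return -10 * math.log10(p)


if __name__ == "__main__":
    # Toy contig lengths, not data from the study.
    print(contig_n50([2, 2, 2, 3, 3, 4, 8, 8]))
    # Under the Phred assumption, QV = 56 is roughly one error per 400 kbp.
    p = qv_to_error_rate(56)
    print(p, round(1 / p))
```

Under that assumption, a QV of 56 corresponds to a per-base error probability of about 2.5 × 10⁻⁶, and a contig N50 of 43 Mbp means that half of the assembled bases lie in contigs of at least 43 Mbp.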