Structural variation in 1,019 diverse humans based on long-read sequencing.

Authors

Schloissnig Siegfried, Pani Samarendra, Ebler Jana, Hain Carsten, Tsapalou Vasiliki, Söylev Arda, Hüther Patrick, Ashraf Hufsah, Prodanov Timofey, Asparuhova Mila, Magalhães Hugo, Höps Wolfram, Sotelo-Fonseca Jesus Emiliano, Fitzgerald Tomas, Santana-Garcia Walter, Moreira-Pinhal Ricardo, Hunt Sarah, Pérez-Llanos Francy J, Wollenweber Tassilo Erik, Sivalingam Sugirthan, Wieczorek Dagmar, Cáceres Mario, Gilissen Christian, Birney Ewan, Ding Zhihao, Jensen Jan Nygaard, Podduturi Nikhil, Stutzki Jan, Rodriguez-Martin Bernardo, Rausch Tobias, Marschall Tobias, Korbel Jan O

Affiliations

Research Institute of Molecular Pathology (IMP), Vienna BioCenter (VBC), Vienna, Austria.

Institute for Medical Biometry and Bioinformatics, Medical Faculty and University Hospital Düsseldorf, Heinrich Heine University Düsseldorf, Düsseldorf, Germany.

Publication information

Nature. 2025 Jul 23. doi: 10.1038/s41586-025-09290-7.

Abstract

Genomic structural variants (SVs) contribute substantially to genetic diversity and human diseases, yet remain under-characterized in population-scale cohorts. Here we conducted long-read sequencing in 1,019 humans to construct an intermediate-coverage resource covering 26 populations from the 1000 Genomes Project. Integrating linear and graph genome-based analyses, we uncover over 100,000 sequence-resolved biallelic SVs and we genotype 300,000 multiallelic variable number of tandem repeats, advancing SV characterization over short-read-based population-scale surveys. We characterize deletions, duplications, insertions and inversions in distinct populations. Long interspersed nuclear element-1 (L1) and SINE-VNTR-Alu (SVA) retrotransposition activities mediate the transduction of unique sequence stretches in 5' or 3', depending on source mobile element class and locus. SV breakpoint analyses point to a spectrum of homology-mediated processes contributing to SV formation and recurrent deletion events. Our open-access resource underscores the value of long-read sequencing in advancing SV characterization and enables guiding variant prioritization in patient genomes.
