Haplotype-resolved diverse human genomes and integrated analysis of structural variation.

Affiliations

Heinrich Heine University, Medical Faculty, Institute for Medical Biometry and Bioinformatics, Moorenstraße 20, 40225 Düsseldorf, Germany.

Department of Genome Sciences, University of Washington School of Medicine, 3720 15th Avenue NE, Seattle, WA 98195-5065, USA.

Publication information

Science. 2021 Apr 2;372(6537). doi: 10.1126/science.abf7117. Epub 2021 Feb 25.

Abstract

Long-read and strand-specific sequencing technologies together facilitate the de novo assembly of high-quality haplotype-resolved human genomes without parent-child trio data. We present 64 assembled haplotypes from 32 diverse human genomes. These highly contiguous haplotype assemblies (average minimum contig length needed to cover 50% of the genome: 26 million base pairs) integrate all forms of genetic variation, even across complex loci. We identified 107,590 structural variants (SVs), of which 68% were not discovered with short-read sequencing, and 278 SV hotspots (spanning megabases of gene-rich sequence). We characterized 130 of the most active mobile element source elements and found that 63% of all SVs arise through homology-mediated mechanisms. This resource enables reliable graph-based genotyping from short reads of up to 50,340 SVs, resulting in the identification of 1526 expression quantitative trait loci as well as SV candidates for adaptive selection within the human population.
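The contiguity figure quoted in the abstract (the minimum contig length needed to cover 50% of the genome) is the statistic commonly called N50. As a minimal sketch of how it can be computed, assuming nothing about the authors' own tooling: the function name `n50`, the optional `target_size` parameter, and the toy contig lengths below are illustrative assumptions, not values or code from the paper.

```python
def n50(contig_lengths, target_size=None):
    """Return the smallest contig length L such that contigs of length >= L
    together cover at least half of the target size.

    By default the target is the total assembly length (classic N50).
    Passing an assumed genome size as target_size gives the NG50 variant.
    """
    if not contig_lengths:
        raise ValueError("contig_lengths must not be empty")

    target = sum(contig_lengths) if target_size is None else target_size

    running = 0
    # Walk from the longest contig downwards, accumulating covered bases.
    for length in sorted(contig_lengths, reverse=True):
        running += length
        # Doubling the running sum avoids a fractional half-target.
        if 2 * running >= target:
            return length

    # Only reachable when target_size exceeds twice the assembly length.
    raise ValueError("contigs cover less than half of target_size")


if __name__ == "__main__":
    toy_lengths = [2, 3, 4, 5, 6]  # hypothetical contig lengths, not real data
    print(n50(toy_lengths))
    print(n50(toy_lengths, target_size=30))
```

Whether a reported value is normalized by assembly length or by an assumed genome size changes the result, so the `target_size` argument is kept explicit here rather than hard-coded.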



