Haplotype-resolved diverse human genomes and integrated analysis of structural variation.

Affiliations

Heinrich Heine University, Medical Faculty, Institute for Medical Biometry and Bioinformatics, Moorenstraße 20, 40225 Düsseldorf, Germany.

Department of Genome Sciences, University of Washington School of Medicine, 3720 15th Avenue NE, Seattle, WA 98195-5065, USA.

Publication information

Science. 2021 Apr 2;372(6537). doi: 10.1126/science.abf7117. Epub 2021 Feb 25.

DOI:10.1126/science.abf7117
PMID:33632895
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC8026704/
Abstract

Long-read and strand-specific sequencing technologies together facilitate the de novo assembly of high-quality haplotype-resolved human genomes without parent-child trio data. We present 64 assembled haplotypes from 32 diverse human genomes. These highly contiguous haplotype assemblies (average minimum contig length needed to cover 50% of the genome: 26 million base pairs) integrate all forms of genetic variation, even across complex loci. We identified 107,590 structural variants (SVs), of which 68% were not discovered with short-read sequencing, and 278 SV hotspots (spanning megabases of gene-rich sequence). We characterized 130 of the most active mobile element source elements and found that 63% of all SVs arise through homology-mediated mechanisms. This resource enables reliable graph-based genotyping from short reads of up to 50,340 SVs, resulting in the identification of 1526 expression quantitative trait loci as well as SV candidates for adaptive selection within the human population.
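The contiguity figure quoted in the abstract ("minimum contig length needed to cover 50% of the genome") is the statistic usually called N50. The sketch below is only an illustration of how such a value can be computed from a list of contig lengths; it is not the authors' pipeline. The function name `n50`, the optional `target_size` parameter, and the toy lengths are assumptions made for this example. By default the denominator is the total assembled length; passing a genome size instead gives the variant often called NG50.

```python
def n50(contig_lengths, target_size=None):
    """Return the length of the contig at which the running total,
    taken from longest to shortest, first reaches half of the target.

    target_size defaults to the sum of all contig lengths (classic N50).
    Passing an estimated genome size instead yields an NG50-style value.
    Returns None if the contigs never reach half of the target.
    """
    if not contig_lengths:
        raise ValueError("need at least one contig length")
    total = target_size if target_size is not None else sum(contig_lengths)
    running = 0
    for length in sorted(contig_lengths, reverse=True):
        running += length
        # Compare 2 * running with total to avoid fractional halves.
        if 2 * running >= total:
            return length
    return None


# Hypothetical toy assembly (lengths in arbitrary units, not real data).
toy_contigs = [80, 70, 50, 40, 30, 20, 10]
print(n50(toy_contigs))                    # half of 300 is reached at the 70-unit contig
print(n50(toy_contigs, target_size=400))   # half of 400 is reached at the 50-unit contig
```

A larger value means that fewer, longer contigs are needed to span half of the sequence, which is why the statistic is used as a summary of assembly contiguity.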

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/d7e1/8026704/399f10ec29fb/nihms-1680320-f0001.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/d7e1/8026704/c0c40ca25900/nihms-1680320-f0002.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/d7e1/8026704/828b14acdb14/nihms-1680320-f0003.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/d7e1/8026704/ff8ce555016b/nihms-1680320-f0004.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/d7e1/8026704/0b29fba723ba/nihms-1680320-f0005.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/d7e1/8026704/8eee2f8e3ec0/nihms-1680320-f0006.jpg
