Telomere-to-telomere assembly of diploid chromosomes with Verkko.

Affiliations

Genome Informatics Section, Computational and Statistical Genomics Branch, National Human Genome Research Institute, National Institutes of Health, Bethesda, MD, USA.

Oxford Nanopore Technologies, Oxford, UK.

Publication information

Nat Biotechnol. 2023 Oct;41(10):1474-1482. doi: 10.1038/s41587-023-01662-6. Epub 2023 Feb 16.

Abstract

The Telomere-to-Telomere consortium recently assembled the first truly complete sequence of a human genome. To resolve the most complex repeats, this project relied on manual integration of ultra-long Oxford Nanopore sequencing reads with a high-resolution assembly graph built from long, accurate PacBio high-fidelity reads. We have improved and automated this strategy in Verkko, an iterative, graph-based pipeline for assembling complete, diploid genomes. Verkko begins with a multiplex de Bruijn graph built from long, accurate reads and progressively simplifies this graph by integrating ultra-long reads and haplotype-specific markers. The result is a phased, diploid assembly of both haplotypes, with many chromosomes automatically assembled from telomere to telomere. Running Verkko on the HG002 human genome resulted in 20 of 46 diploid chromosomes assembled without gaps at 99.9997% accuracy. The complete assembly of diploid genomes is a critical step towards the construction of comprehensive pangenome databases and chromosome-scale comparative genomics.
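
The two headline figures in the abstract can be put on more familiar scales with a little arithmetic. The sketch below assumes that "99.9997% accuracy" is a per-base consensus accuracy, which the abstract does not state explicitly. Under that assumption it converts to a Phred-scaled quality value of roughly Q55, and 20 of 46 chromosomes is a little under half. The function name `accuracy_to_phred` is illustrative and is not part of Verkko.

```python
import math


def accuracy_to_phred(accuracy_percent: float) -> float:
    """Convert a percent accuracy to a Phred-scaled quality value (QV).

    QV = -10 * log10(error rate), where error rate = 1 - accuracy.
    Assumes the accuracy is a per-base figure (an assumption, see lead-in).
    """
    error_rate = 1.0 - accuracy_percent / 100.0
    if error_rate <= 0.0:
        raise ValueError("accuracy must be below 100% to give a finite QV")
    return -10.0 * math.log10(error_rate)


# Figures quoted in the abstract for the HG002 assembly.
qv = accuracy_to_phred(99.9997)
gapless_fraction = 20 / 46  # chromosomes assembled without gaps

print(f"Phred QV: {qv:.1f}")
print(f"Gapless chromosomes: {gapless_fraction:.1%}")
```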
