Benchmarking of bioinformatics tools for the hybrid assembly of human and non-human whole-genome sequencing data.

Authors

Muñoz-Barrera Adrián, Rubio-Rodríguez Luis A, Jáspez David, Corrales Almudena, Marcelino-Rodriguez Itahisa, Ortiz Lourdes, Mendoza Pablo, Lorenzo-Salazar José M, González-Montelongo Rafaela, Flores Carlos

Affiliations

Genomics Division, Instituto Tecnológico y de Energías Renovables (ITER), Santa Cruz de Tenerife, Spain.

Research Unit, Hospital Universitario Nuestra Señora de Candelaria, Instituto de Investigación Sanitaria de Canarias, Santa Cruz de Tenerife, Spain.

Publication information

Comput Struct Biotechnol J. 2025 Jul 13;27:3099-3109. doi: 10.1016/j.csbj.2025.07.020. eCollection 2025.

DOI:10.1016/j.csbj.2025.07.020
PMID:40703096
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC12284544/

Abstract

Accurate and complete genome assemblies enable variant identification and the discovery of novel genomic features and biological functions. However, assemblies of large and complex genomes remain challenging. Long-read sequencing data, alone or combined with short-read data, facilitate genome assembly. However, comprehensive evaluations of software performance remain limited in the literature, especially for human genome assembly. We benchmarked 11 pipelines, including four long-read-only assemblers and three hybrid assemblers, combined with four polishing schemes, using the HG002 human reference material sequenced with Oxford Nanopore Technologies and Illumina. The best-performing pipeline was validated with non-reference human and non-human routine laboratory samples. Software performance was assessed using QUAST, BUSCO, and Merqury metrics, alongside computational cost analyses. We found that Flye outperformed all assemblers, particularly with Ratatosk error-corrected long reads. Polishing improved the assembly accuracy and continuity, with two rounds of Racon and Pilon yielding the best results. The assembly of data from validation samples showed comparable assembly metrics to those of the reference material. Based on these results, we provide a complete, optimized analysis pipeline for assembly, polishing, and contig curation, developed on Nextflow to enable efficient parallelization and built-in dependency management, to further advance the generation of high-quality and chromosome-level assemblies.
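
The abstract assesses assemblies with QUAST, BUSCO, and Merqury but does not spell out the individual metrics. As a rough illustration only, the sketch below computes two quantities of the kind these tools report: the contig N50 (a contiguity statistic reported by QUAST) and a k-mer-based consensus quality value in the style of Merqury. The function names `n50` and `kmer_qv`, the default `k=21`, and the example inputs are assumptions made for illustration; they are not taken from the paper or from either tool's code, and the QV formula follows the Merqury definition as I understand it (error rate derived from k-mers found only in the assembly).

```python
import math
from typing import Sequence


def n50(lengths: Sequence[int]) -> int:
    """Return the N50: the largest length L such that contigs of
    length >= L together cover at least half of the total assembly."""
    if not lengths:
        raise ValueError("at least one contig length is required")
    total = sum(lengths)
    running = 0
    for length in sorted(lengths, reverse=True):
        running += length
        if 2 * running >= total:
            return length
    raise AssertionError("unreachable for non-empty input")


def kmer_qv(asm_only_kmers: int, total_asm_kmers: int, k: int = 21) -> float:
    """Phred-scaled consensus quality from k-mer counts (Merqury-style).

    asm_only_kmers:  k-mers present in the assembly but absent from the reads
    total_asm_kmers: all k-mers present in the assembly
    k:               k-mer size (21 is an illustrative default, an assumption)
    """
    if total_asm_kmers <= 0 or not 0 <= asm_only_kmers <= total_asm_kmers:
        raise ValueError("k-mer counts are inconsistent")
    if asm_only_kmers == 0:
        return math.inf  # no unsupported k-mers observed
    # Probability that a single base is correct, assuming independent errors.
    p_correct = (1 - asm_only_kmers / total_asm_kmers) ** (1 / k)
    error_rate = 1 - p_correct
    return -10 * math.log10(error_rate)


if __name__ == "__main__":
    # Hypothetical contig lengths and k-mer counts, not data from the study.
    print("N50:", n50([80, 70, 50, 40, 30, 20]))
    print("QV: %.1f" % kmer_qv(asm_only_kmers=1_000, total_asm_kmers=10_000_000))
```

Higher N50 indicates a more contiguous assembly and higher QV indicates fewer consensus errors, which is the direction of improvement the abstract attributes to polishing.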

Figures

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/2f52/12284544/c007f1dc6cd0/ga1.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/2f52/12284544/eaf19c50dc1a/gr1.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/2f52/12284544/528eb4874739/gr2.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/2f52/12284544/4fc6666e0d11/gr3.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/2f52/12284544/157cb831516e/gr4.jpg
