Easing genomic surveillance: A comprehensive performance evaluation of long-read assemblers across multi-strain mixture data of HIV-1 and Other pathogenic viruses for constructing a user-friendly bioinformatic pipeline.

Affiliation

Department of Microbiology, Faculty of Medicine, Chiang Mai University, Chiang Mai, 50200, Thailand.

Publication information

F1000Res. 2024 May 31;13:556. doi: 10.12688/f1000research.149577.1. eCollection 2024.

DOI:10.12688/f1000research.149577.1
PMID:38984017
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC11231628/
Abstract

BACKGROUND

Determining the appropriate computational requirements and software performance is essential for efficient genomic surveillance. The lack of standardized benchmarking complicates software selection, especially with limited resources.

METHODS

We developed a containerized benchmarking pipeline to evaluate seven long-read assemblers (Canu, GoldRush, MetaFlye, Strainline, HaploDMF, iGDA, and RVHaplo) for viral haplotype reconstruction, using both simulated and experimental Oxford Nanopore sequencing data of HIV-1 and other viruses. Benchmarking was conducted on three computational systems to assess each assembler's performance, with QUAST and BLASTN used for quality assessment.
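The timing side of such a benchmark can be illustrated with a minimal Python sketch. It is not the authors' pipeline: the `RunResult`, `time_command`, and `benchmark` names are hypothetical, and the two demo commands are placeholders that only start a Python interpreter, standing in for real assembler invocations whose command lines are not given in the abstract.

```python
import subprocess
import sys
import time
from dataclasses import dataclass


@dataclass
class RunResult:
    name: str
    seconds: float
    returncode: int


def time_command(name, argv):
    """Run one command and record its wall-clock time and exit status."""
    start = time.perf_counter()
    proc = subprocess.run(argv, stdout=subprocess.DEVNULL, stderr=subprocess.DEVNULL)
    return RunResult(name, time.perf_counter() - start, proc.returncode)


def benchmark(commands):
    """Time every command in a {label: argv} mapping and sort fastest first."""
    results = [time_command(name, argv) for name, argv in commands.items()]
    return sorted(results, key=lambda r: r.seconds)


# Placeholder commands (assumption): each one just runs a tiny Python snippet.
# A real benchmark would put an assembler command line in each argv list.
demo_commands = {
    "fast_placeholder": [sys.executable, "-c", "pass"],
    "slow_placeholder": [sys.executable, "-c", "import time; time.sleep(0.5)"],
}
demo_results = benchmark(demo_commands)

for result in demo_results:
    print(f"{result.name}: {result.seconds:.2f} s (exit {result.returncode})")
```

Wall-clock time is only one of the quantities the study reports; CPU and memory usage would need extra instrumentation that this sketch leaves out.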

RESULTS

Our findings show that assembler choice significantly impacts assembly time, whereas CPU and memory usage have minimal effect. Assembler selection also influences contig size, and a minimum read length of 2,000 nucleotides is required for quality assembly; a read length of 4,000 nucleotides improves quality further. Canu was efficient among the assemblers but not suitable for multi-strain mixtures, while GoldRush produced only consensus assemblies. Strainline and MetaFlye were suitable for metagenomic sequencing data, with Strainline requiring high memory and MetaFlye operable on low-specification machines. Among the reference-based assemblers, iGDA had high error rates, RVHaplo showed the best runtime and accuracy but became ineffective with similar sequences, and HaploDMF, which uses machine learning, had fewer errors with a slightly longer runtime.
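The read-length threshold reported above could be applied as a pre-assembly filter. The following Python sketch assumes plain four-line FASTQ records; the function names and the synthetic demo reads are illustrative and are not taken from the paper.

```python
import io

# Minimum read length reported in the abstract; 4,000 is reported to improve quality further.
MIN_READ_LENGTH = 2000


def read_fastq(handle):
    """Yield (header, sequence, quality) tuples from a four-line-per-record FASTQ stream."""
    while True:
        header = handle.readline().rstrip("\n")
        if not header:
            return
        sequence = handle.readline().rstrip("\n")
        handle.readline()  # skip the '+' separator line
        quality = handle.readline().rstrip("\n")
        yield header, sequence, quality


def filter_by_length(records, min_length=MIN_READ_LENGTH):
    """Keep only the records whose sequence has at least min_length bases."""
    return [record for record in records if len(record[1]) >= min_length]


def make_record(name, base, length):
    """Build one synthetic FASTQ record (demo data only, not real reads)."""
    return f"@{name}\n{base * length}\n+\n{'I' * length}\n"


demo_fastq = io.StringIO(
    make_record("short_read", "A", 1500) + make_record("long_read", "C", 2500)
)
kept = filter_by_length(read_fastq(demo_fastq))
kept_names = [header for header, _, _ in kept]
print(kept_names)
```

Passing `min_length=4000` would apply the stricter threshold instead.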

CONCLUSIONS

The HIV-64148 pipeline, containerized using Docker, facilitates easy deployment and offers flexibility to select from a range of assemblers to match computational systems or study requirements. This tool aids in genome assembly and provides valuable information on HIV-1 sequences, enhancing viral evolution monitoring and understanding.
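One way to picture matching an assembler to a system or study requirement is a small lookup built from the qualitative findings in the Results. This Python sketch is not the selection logic of the HIV-64148 pipeline. The flag names are hypothetical, and the flag values are a reading of the abstract: treating the four assemblers not named as reference-based as de novo, and treating GoldRush's consensus-only output as unsuitable for multi-strain mixtures, are both assumptions.

```python
# Qualitative traits paraphrased from the abstract's Results section.
# Flag names and the True/False encoding are assumptions made for illustration.
ASSEMBLERS = {
    "Canu":       {"reference_based": False, "multi_strain_ok": False, "high_memory": False},
    "GoldRush":   {"reference_based": False, "multi_strain_ok": False, "high_memory": False},
    "MetaFlye":   {"reference_based": False, "multi_strain_ok": True,  "high_memory": False},
    "Strainline": {"reference_based": False, "multi_strain_ok": True,  "high_memory": True},
    "HaploDMF":   {"reference_based": True,  "multi_strain_ok": True,  "high_memory": False},
    "iGDA":       {"reference_based": True,  "multi_strain_ok": True,  "high_memory": False},
    "RVHaplo":    {"reference_based": True,  "multi_strain_ok": True,  "high_memory": False},
}


def suggest_assemblers(reference_based, multi_strain, low_memory):
    """Return assembler names whose recorded traits fit the stated constraints."""
    suggestions = []
    for name, traits in ASSEMBLERS.items():
        if traits["reference_based"] != reference_based:
            continue
        if multi_strain and not traits["multi_strain_ok"]:
            continue
        if low_memory and traits["high_memory"]:
            continue
        suggestions.append(name)
    return suggestions


de_novo_low_spec = suggest_assemblers(reference_based=False, multi_strain=True, low_memory=True)
with_reference = suggest_assemblers(reference_based=True, multi_strain=True, low_memory=False)
print(de_novo_low_spec, with_reference)
```

The lookup ignores the accuracy and runtime differences the abstract reports among the reference-based tools, so its output is a shortlist rather than a recommendation.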
