BonoboFlow: viral genome assembly and haplotype reconstruction from nanopore reads.

Authors

Ndekezi Christian, Byamukama Drake, Kato Frank, Omara Denis, Nakyanzi Angella, Natwijuka Fortunate, Mugaba Susan, Ssekagiri Alfred, Bbosa Nicholas, Sande Obondo James, Kimuda Magambo Phillip, Byarugaba Denis K, Kapaata Anne, Sutar Jyoti, Bhattacharya Jayanta, Kaleebu Pontiano, Balinda Sheila N

Affiliations

Medical Research Council/Uganda Virus Research Institute & London School of Hygiene and Tropical Medicine (MRC), Entebbe, P.O. Box 49, Uganda.

College of Health Sciences, Department of Immunology and Molecular Biology, Makerere University, Kampala, P.O. Box 7062, Uganda.

Publication information

Bioinform Adv. 2025 May 13;5(1):vbaf115. doi: 10.1093/bioadv/vbaf115. eCollection 2025.

DOI:10.1093/bioadv/vbaf115

PMID:40487929

Original article: https://pmc.ncbi.nlm.nih.gov/articles/PMC12141814/

Abstract

SUMMARY

Viral genome sequencing and analysis are crucial for understanding the diversity and evolution of viruses. Traditional Sanger sequencing is labor-intensive and limited by low sequencing depth. Next-generation sequencing (NGS) methods, such as Illumina, offer improved sequencing depth and throughput, but their short reads fragment the genome and make accurate reconstruction of viral genomes difficult. Third-generation sequencing platforms, such as PacBio and Oxford Nanopore Technologies (ONT), generate long reads with high throughput. However, PacBio is constrained by substantial resource requirements, while ONT suffers from inherently high error rates. Moreover, standardized pipelines for ONT sequencing that cover every step from basecalling to genome assembly remain limited.

RESULTS

Here, we introduce BonoboFlow, a standardized Nextflow pipeline designed to streamline ONT-based viral genome assembly and haplotype reconstruction. BonoboFlow integrates the key processing steps: basecalling, read filtering, chimeric read removal, error correction, draft genome assembly or haplotype reconstruction, and genome polishing. The pipeline accepts raw POD5 or basecalled FASTQ files as input, produces FASTA consensus files as output, and uses a reference genome (in FASTA format) to filter out contaminant reads. BonoboFlow is containerized with Docker and Singularity, which allows deployment across diverse computing environments. BonoboFlow performs well when assembling small and medium viral genomes, but it had difficulty reconstructing large viral genomes.
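To make the read filtering step more concrete, the following is a minimal Python sketch of how reads in a basecalled FASTQ file might be screened by length and mean quality. It is an illustration only and not BonoboFlow's actual implementation: the function names (`parse_fastq`, `mean_phred`, `filter_reads`) and the thresholds (`min_len`, `min_q`) are hypothetical, and the abstract does not state which tools or cutoffs the pipeline uses.

```python
import math
from typing import Iterable, Iterator, List, Tuple

# A read is (header, sequence, quality string); all names here are hypothetical.
Read = Tuple[str, str, str]


def parse_fastq(lines: List[str]) -> Iterator[Read]:
    """Yield (header, sequence, quality) from well-formed 4-line FASTQ records."""
    for i in range(0, len(lines) - 3, 4):
        header, seq, _plus, qual = (line.strip() for line in lines[i:i + 4])
        yield header[1:], seq, qual  # drop the leading '@'


def mean_phred(qual: str) -> float:
    """Mean read quality (Phred+33), averaged over error probabilities
    rather than over the raw Q scores."""
    if not qual:
        return 0.0
    probs = [10 ** (-(ord(c) - 33) / 10) for c in qual]
    return -10 * math.log10(sum(probs) / len(probs))


def filter_reads(reads: Iterable[Read],
                 min_len: int = 1000,
                 min_q: float = 10.0) -> List[Read]:
    """Keep reads that are long enough and of sufficient mean quality.
    The default thresholds are assumptions chosen for illustration."""
    return [r for r in reads
            if len(r[1]) >= min_len and mean_phred(r[2]) >= min_q]


if __name__ == "__main__":
    demo = [
        "@read1", "ACGTACGT", "+", "IIIIIIII",   # Q40 throughout
        "@read2", "ACGTACGT", "+", "########",   # Q2 throughout
        "@read3", "ACG", "+", "III",             # too short
    ]
    kept = filter_reads(parse_fastq(demo), min_len=8, min_q=10.0)
    print([header for header, _, _ in kept])
```

Averaging error probabilities instead of Q scores keeps a handful of very poor bases from being masked by many good ones, which matters for error-prone long reads.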

AVAILABILITY AND IMPLEMENTATION

BonoboFlow and the corresponding container images are publicly available at https://github.com/nchis09/BonoboFlow and https://hub.docker.com/r/nchis09/bonobo_image. The test dataset is available in the SRA repository under accession number PRJNA1137155 (http://www.ncbi.nlm.nih.gov/bioproject/1137155).
