Comparative Evaluation of Open-Source Bioinformatics Pipelines for Full-Length Viral Genome Assembly.

Authors

Zsichla Levente, Zeeb Marius, Fazekas Dávid, Áy Éva, Müller Dalma, Metzner Karin J, Kouyos Roger D, Müller Viktor

Affiliations

Institute of Biology, ELTE Eötvös Loránd University, 1117 Budapest, Hungary.

National Laboratory for Health Security, ELTE Eötvös Loránd University, 1117 Budapest, Hungary.

Publication information

Viruses. 2024 Nov 24;16(12):1824. doi: 10.3390/v16121824.

DOI:10.3390/v16121824
PMID:39772134
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC11680378/
Abstract

The increasingly widespread application of next-generation sequencing (NGS) in clinical diagnostics and epidemiological research has generated a demand for robust, fast, automated, and user-friendly bioinformatics workflows. To guide the choice of tools for the assembly of full-length viral genomes from NGS datasets, we assessed the performance and applicability of four open-source bioinformatics pipelines (shiver, for which we created a user-friendly Dockerized version referred to as dshiver; SmaltAlign; viral-ngs; and V-pipe) using both simulated and real-world HIV-1 paired-end short-read datasets and default settings. All four pipelines produced consensus genome assemblies with high quality metrics (genome fraction recovery, mismatch and indel rates, variant calling F1 scores) when the reference sequence used for assembly had high similarity to the analyzed sample. The shiver and SmaltAlign pipelines (but not viral-ngs and V-pipe) also showed robust performance with more divergent samples (non-matching subtypes). With empirical datasets, SmaltAlign and viral-ngs had runtimes an order of magnitude shorter than those of V-pipe and shiver. In terms of applicability, V-pipe provides the broadest functionality, SmaltAlign and dshiver combine user-friendliness with robustness, and viral-ngs requires fewer computational resources than the other pipelines. In conclusion, if a closely matched reference sequence is available, all pipelines can reliably reconstruct viral consensus genomes; therefore, differences in user-friendliness and runtime may guide the choice of pipeline in a particular setting. If a matched reference sequence cannot be selected, we recommend shiver or SmaltAlign for robust performance. The new Dockerized version of shiver offers ease of use in addition to the accuracy and robustness of the original pipeline.
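
The quality metrics named in the abstract (genome fraction recovery, mismatch and indel rates, variant calling F1 score) can be illustrated with a minimal Python sketch. It is not the evaluation code used in the study. The function names, the variant representation as `(position, ref, alt)` tuples, and the per-column definitions below are assumptions chosen for illustration; the paper's exact metric definitions may differ.

```python
from typing import Dict, Set, Tuple

# Hypothetical variant representation: (position, reference base, alternate base).
Variant = Tuple[int, str, str]


def variant_f1(truth: Set[Variant], called: Set[Variant]) -> float:
    """F1 score of a called variant set against a known truth set."""
    tp = len(truth & called)   # variants found in both sets
    fp = len(called - truth)   # called but not true
    fn = len(truth - called)   # true but not called
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


def consensus_metrics(truth_aln: str, consensus_aln: str) -> Dict[str, float]:
    """Compare two rows of a pairwise alignment ('-' marks a gap).

    Assumed definitions (illustrative only):
      genome_fraction = truth bases aligned to a called (non-N) consensus base
                        divided by the truth genome length
      mismatch_rate   = mismatching columns / covered columns
      indel_rate      = columns with a gap in exactly one row / covered columns
    """
    if len(truth_aln) != len(consensus_aln):
        raise ValueError("aligned sequences must have equal length")
    truth_aln, consensus_aln = truth_aln.upper(), consensus_aln.upper()
    truth_len = sum(1 for base in truth_aln if base != "-")
    covered = mismatches = indel_columns = 0
    for t, c in zip(truth_aln, consensus_aln):
        if t == "-" and c == "-":
            continue                 # padding column, ignore
        if t == "-" or c == "-":
            indel_columns += 1       # gap in exactly one sequence
            continue
        if c == "N":
            continue                 # uncalled base: not counted as covered
        covered += 1
        if t != c:
            mismatches += 1
    return {
        "genome_fraction": covered / truth_len if truth_len else 0.0,
        "mismatch_rate": mismatches / covered if covered else 0.0,
        "indel_rate": indel_columns / covered if covered else 0.0,
    }
```

For example, `variant_f1` applied to a truth set and a call set that share two of three variants each gives precision and recall of 2/3, and therefore an F1 score of 2/3. Note that this sketch counts gapped columns rather than indel events, so a three-base deletion contributes three columns.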


Figures:
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/62cb/11680378/0473b6d3d0d3/viruses-16-01824-g001.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/62cb/11680378/9607f454812e/viruses-16-01824-g002.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/62cb/11680378/a4a4f8af23f2/viruses-16-01824-g003.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/62cb/11680378/35e024262b56/viruses-16-01824-g004.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/62cb/11680378/2cb8ff8bc301/viruses-16-01824-g005.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/62cb/11680378/c827e682e9bf/viruses-16-01824-g006.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/62cb/11680378/a593fd97b060/viruses-16-01824-g007.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/62cb/11680378/f4dd787b29c4/viruses-16-01824-g008.jpg
