V-pipe: a computational pipeline for assessing viral genetic diversity from high-throughput data.

Authors

Posada-Céspedes Susana, Seifert David, Topolsky Ivan, Jablonski Kim Philipp, Metzner Karin J, Beerenwinkel Niko

Affiliations

Department of Biosystems Science and Engineering, ETH Zurich, 4058 Basel, Switzerland.

SIB Swiss Institute of Bioinformatics, 4058 Basel, Switzerland.

Publication information

Bioinformatics. 2021 Jul 19;37(12):1673-1680. doi: 10.1093/bioinformatics/btab015.

DOI: 10.1093/bioinformatics/btab015
PMID: 33471068
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC8289377/
Abstract

MOTIVATION

High-throughput sequencing technologies are used increasingly not only in viral genomics research but also in clinical surveillance and diagnostics. These technologies facilitate the assessment of the genetic diversity in intra-host virus populations, which affects transmission, virulence and pathogenesis of viral infections. However, there are two major challenges in analysing viral diversity. First, amplification and sequencing errors confound the identification of true biological variants, and second, the large data volumes represent computational limitations.

RESULTS

To support viral high-throughput sequencing studies, we developed V-pipe, a bioinformatics pipeline combining various state-of-the-art statistical models and computational tools for automated end-to-end analyses of raw sequencing reads. V-pipe supports quality control, read mapping and alignment, low-frequency mutation calling, and inference of viral haplotypes. For generating high-quality read alignments, we developed a novel method, called ngshmmalign, based on profile hidden Markov models and tailored to small and highly diverse viral genomes. V-pipe also includes benchmarking functionality providing a standardized environment for comparative evaluations of different pipeline configurations. We demonstrate this capability by assessing the impact of three different read aligners (Bowtie 2, BWA MEM, ngshmmalign) and two different variant callers (LoFreq, ShoRAH) on the performance of calling single-nucleotide variants in intra-host virus populations. V-pipe supports various pipeline configurations and is implemented in a modular fashion to facilitate adaptations to the continuously changing technology landscape.
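The benchmarking comparison described above crosses three read aligners with two variant callers, giving six pipeline configurations to score. A minimal sketch of that grid follows, in plain Python rather than V-pipe's own code. The function names, the `(position, alt_base)` representation of a single-nucleotide variant, and the choice of precision and recall as metrics are assumptions made for illustration; the abstract does not state which metrics were used, and the toy data are invented.

```python
from itertools import product

# Tool names as listed in the abstract.
ALIGNERS = ("Bowtie 2", "BWA MEM", "ngshmmalign")
CALLERS = ("LoFreq", "ShoRAH")


def configurations():
    """Return every aligner/caller pairing in the comparison grid."""
    return list(product(ALIGNERS, CALLERS))


def precision_recall(called, truth):
    """Score called SNVs against a known ground-truth set.

    Each SNV is a hypothetical (position, alt_base) tuple.
    Empty inputs score 0.0 instead of raising ZeroDivisionError.
    """
    called, truth = set(called), set(truth)
    true_positives = len(called & truth)
    precision = true_positives / len(called) if called else 0.0
    recall = true_positives / len(truth) if truth else 0.0
    return precision, recall


if __name__ == "__main__":
    # Toy data, invented for illustration only.
    truth = {(101, "A"), (250, "T"), (399, "G"), (512, "C")}
    called = {(101, "A"), (250, "T"), (777, "G")}

    for aligner, caller in configurations():
        print(f"configuration: {aligner} + {caller}")

    precision, recall = precision_recall(called, truth)
    print(f"precision={precision:.2f} recall={recall:.2f}")
```

In a real benchmark each configuration would produce its own call set from simulated or control data with known variants, and `precision_recall` would be applied once per configuration.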

AVAILABILITY AND IMPLEMENTATION

V-pipe is freely available at https://github.com/cbg-ethz/V-pipe.

SUPPLEMENTARY INFORMATION

Supplementary data are available at Bioinformatics online.


