Eagle Global Scientific LLC, Atlanta, Georgia, USA.
Division of Viral Diseases, National Center for Immunization and Respiratory Diseases, Centers for Disease Control and Prevention, Atlanta, Georgia, USA.
Microbiol Spectr. 2022 Apr 27;10(2):e0256421. doi: 10.1128/spectrum.02564-21. Epub 2022 Mar 2.
Next-generation sequencing (NGS) is a powerful tool for detecting and investigating viral pathogens; however, analysis and management of the enormous amounts of data generated by these technologies remain a challenge. Here, we present VPipe (the Viral NGS Analysis Pipeline and Data Management System), an automated bioinformatics pipeline optimized for whole-genome assembly of viral sequences and identification of diverse species. VPipe automates the data quality control, assembly, and contig identification steps typically performed when analyzing NGS data. Users access the pipeline through a secure web-based portal, which provides an easy-to-use interface with advanced search capabilities for reviewing results. In addition, VPipe provides a centralized system for storing and analyzing NGS data, eliminating common bottlenecks in bioinformatics analyses for public health laboratories with limited on-site computational infrastructure. The performance of VPipe was validated through the analysis of publicly available NGS data sets for viral pathogens, generating high-quality assemblies for 12 data sets. VPipe also generated assemblies with greater contiguity than similar pipelines for 41 human respiratory syncytial virus isolates and 23 SARS-CoV-2 specimens. Computational infrastructure and bioinformatics analysis are bottlenecks in the application of NGS to viral pathogens. As of September 2021, VPipe has been used by the U.S. Centers for Disease Control and Prevention (CDC) and 12 state public health laboratories to characterize >17,500 and 1,500 clinical specimens and isolates, respectively. VPipe automates genome assembly for a wide range of viruses, including high-consequence pathogens such as SARS-CoV-2. Such automated functionality expedites public health responses to viral outbreaks and pathogen surveillance.