Virus detection in high-throughput sequencing data without a reference genome of the host.

Affiliations

Institute for Animal Breeding and Genetics, University of Veterinary Medicine Hannover, Foundation, Bünteweg 17p, Hannover 30559, Germany.

Research Center for Emerging Infections and Zoonoses, University of Veterinary Medicine Hannover, Foundation, Bünteweg 17, Hannover 30559, Germany.

Publication information

Infect Genet Evol. 2018 Dec;66:180-187. doi: 10.1016/j.meegid.2018.09.026. Epub 2018 Oct 3.

DOI:10.1016/j.meegid.2018.09.026

PMID:30292006

Abstract

Discovery of novel viruses in host samples is a multidisciplinary process which relies increasingly on next-generation sequencing (NGS) followed by computational analysis. A crucial step in this analysis is to separate host sequence reads from the sequence reads of the virus to be discovered. This becomes especially difficult if no reference genome of the host is available. Furthermore, if the total number of viral reads in a sample is low, de novo assembly of a virus, which is a requirement for most existing pipelines, is hard to realize. We present a new modular computational pipeline for discovery of novel viruses in host samples. While existing pipelines rely on the availability of the host's reference genome for filtering sequence reads, our new pipeline can also cope with cases for which no reference genome is available. As a further novelty of our method, a decoy module is used to assess false classification rates in the discovery process. Additionally, viruses with low read coverage can be identified and visually reviewed. We validate our pipeline on simulated data as well as on two experimental samples with known virus content. For the experimental samples, we were able to reproduce the laboratory findings. Our newly developed pipeline is applicable for virus detection in a wide range of host species. The three modules we present can either be incorporated individually into other pipelines or be used as a stand-alone pipeline. We are the first to present a decoy approach within a virus detection pipeline that can be used to assess error rates, so that the quality of the final result can be judged. We provide an implementation of our modules via GitHub. However, the principle of the modules can easily be re-implemented by other researchers.
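The abstract does not describe how the decoy module builds its decoys or computes error rates, so the following is only a minimal sketch of the general target-decoy idea, not the authors' implementation. It assumes that decoys are made by shuffling the residues of each viral reference sequence, that the decoy database has the same size as the target database, and that every classified read is labelled with the database it hit. The function names `make_decoy` and `estimate_false_rate` are hypothetical.

```python
import random


def make_decoy(seq: str, rng: random.Random) -> str:
    """Return a shuffled copy of seq (assumed decoy construction).

    Shuffling keeps length and residue composition but destroys the
    real sequence, so any read assigned to a decoy is a false hit.
    """
    residues = list(seq)
    rng.shuffle(residues)
    return "".join(residues)


def estimate_false_rate(hit_labels):
    """Estimate the false classification rate from read assignments.

    hit_labels: iterable of "target" or "decoy", one label per
    classified read. With equally sized target and decoy databases,
    the number of decoy hits approximates the number of false hits
    among the target hits. Returns None if there are no target hits.
    """
    labels = list(hit_labels)
    n_decoy = sum(1 for label in labels if label == "decoy")
    n_target = sum(1 for label in labels if label == "target")
    if n_target == 0:
        return None
    return n_decoy / n_target


if __name__ == "__main__":
    rng = random.Random(0)  # fixed seed for reproducible decoys
    decoy = make_decoy("ATGGCGTACGTTAGC", rng)
    print(len(decoy))
    # Hypothetical classification result: 8 target hits, 2 decoy hits.
    print(estimate_false_rate(["target"] * 8 + ["decoy"] * 2))
```

In such a scheme, a high estimated rate would suggest that the classification threshold is too permissive for the sample at hand.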

Similar articles

Virus detection in high-throughput sequencing data without a reference genome of the host.

Infect Genet Evol. 2018 Dec;66:180-187. doi: 10.1016/j.meegid.2018.09.026. Epub 2018 Oct 3.

ViraPipe: scalable parallel pipeline for viral metagenome analysis from next generation sequencing reads.

Bioinformatics. 2018 Mar 15;34(6):928-935. doi: 10.1093/bioinformatics/btx702.

Enhanced bioinformatic profiling of VIDISCA libraries for virus detection and discovery.

Virus Res. 2019 Apr 2;263:21-26. doi: 10.1016/j.virusres.2018.12.010. Epub 2018 Dec 19.

Optimization and validation of sample preparation for metagenomic sequencing of viruses in clinical samples.

Microbiome. 2017 Aug 8;5(1):94. doi: 10.1186/s40168-017-0317-z.

VIP: an integrated pipeline for metagenomics of virus identification and discovery.

Sci Rep. 2016 Mar 30;6:23774. doi: 10.1038/srep23774.

VirusSeeker, a computational pipeline for virus discovery and virome composition analysis.

Virology. 2017 Mar;503:21-30. doi: 10.1016/j.virol.2017.01.005. Epub 2017 Jan 18.

drVM: a new tool for efficient genome assembly of known eukaryotic viruses from metagenomes.

Gigascience. 2017 Feb 1;6(2):1-10. doi: 10.1093/gigascience/gix003.

A Next-Generation Sequencing Data Analysis Pipeline for Detecting Unknown Pathogens from Mixed Clinical Samples and Revealing Their Genetic Diversity.

PLoS One. 2016 Mar 17;11(3):e0151495. doi: 10.1371/journal.pone.0151495. eCollection 2016.

Comparing Viral Metagenomic Extraction Methods.

Curr Issues Mol Biol. 2017;24:59-70. doi: 10.21775/cimb.024.059. Epub 2017 Jul 6.

Cataloguing the taxonomic origins of sequences from a heterogeneous sample using phylogenomics: applications in adventitious agent detection.

PDA J Pharm Sci Technol. 2014 Nov-Dec;68(6):602-18. doi: 10.5731/pdajpst.2014.01023.

Cited by

Interactive, Browser-Based Graphics to Visualize Complex Data in Education of Biomedical Sciences for Veterinary Students.

Med Sci Educ. 2022 Sep 22;32(6):1323-1335. doi: 10.1007/s40670-022-01613-x. eCollection 2022 Dec.

Clinical Application and Influencing Factor Analysis of Metagenomic Next-Generation Sequencing (mNGS) in ICU Patients With Sepsis.

Front Cell Infect Microbiol. 2022 Jul 13;12:905132. doi: 10.3389/fcimb.2022.905132. eCollection 2022.

Heat Stress Resistance Mechanisms of Two Cucumber Varieties from Different Regions.

Int J Mol Sci. 2022 Feb 5;23(3):1817. doi: 10.3390/ijms23031817.

Correcting the Estimation of Viral Taxa Distributions in Next-Generation Sequencing Data after Applying Artificial Neural Networks.

Genes (Basel). 2021 Oct 31;12(11):1755. doi: 10.3390/genes12111755.

2019 meeting of the global virus network.

Antiviral Res. 2019 Dec;172:104645. doi: 10.1016/j.antiviral.2019.104645. Epub 2019 Nov 4.

An evolutionary divergent pestivirus lacking the Npro gene systemically infects a whale species.

Emerg Microbes Infect. 2019;8(1):1383-1392. doi: 10.1080/22221751.2019.1664940.
