Nontargeted virus sequence discovery pipeline and virus clustering for metagenomic data.

Author information

Joint Genome Institute, Department of Energy, Walnut Creek, California, USA.

Publication information

Nat Protoc. 2017 Aug;12(8):1673-1682. doi: 10.1038/nprot.2017.063. Epub 2017 Jul 27.

Abstract

The analysis of large microbiome data sets holds great promise for the delineation of the biological and metabolic functioning of living organisms and their role in the environment. In the midst of this genomic puzzle, viruses, especially those that infect microbial communities, represent a major reservoir of genetic diversity with great impact on biogeochemical cycles and organismal health. Overcoming the limitations associated with virus detection directly from microbiomes can provide key insights into how ecosystem dynamics are modulated. Here, we present a computational protocol for accurate detection and grouping of viral sequences from microbiome samples. Our approach relies on an expanded and curated set of viral protein families used as bait to identify viral sequences directly from metagenomic assemblies. This protocol describes how to use the viral protein families catalog (∼7 h) and recommended filters for the detection of viral contigs in metagenomic samples (∼6 h), and it describes the specific parameters for a nucleotide-sequence-identity-based method of organizing the viral sequences into quasi-species taxonomic-level groups (∼10 min).
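The grouping step mentioned above, organizing viral sequences by nucleotide sequence identity, can be illustrated with a minimal sketch. It assumes pairwise alignment hits are already available as (query, subject, percent identity, percent coverage) tuples and links contigs by single-linkage clustering using a union-find structure. The function name, the tuple layout, and the default thresholds are hypothetical choices for illustration; the abstract does not state the protocol's actual parameters or clustering algorithm.

```python
from collections import defaultdict


def cluster_contigs(contig_ids, hits, min_identity=90.0, min_coverage=75.0):
    """Group contigs by single-linkage clustering of pairwise hits.

    contig_ids: iterable of contig names (every name used in `hits`
                must appear here).
    hits: iterable of (query, subject, pct_identity, pct_coverage).
    min_identity, min_coverage: illustrative thresholds only; they are
                assumptions, not values taken from the abstract.
    """
    contig_ids = list(contig_ids)
    parent = {c: c for c in contig_ids}

    def find(x):
        # Walk to the root, compressing the path along the way.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for query, subject, identity, coverage in hits:
        if query == subject:
            continue  # ignore self-hits
        if identity >= min_identity and coverage >= min_coverage:
            root_q, root_s = find(query), find(subject)
            if root_q != root_s:
                parent[root_s] = root_q  # merge the two groups

    groups = defaultdict(list)
    for c in contig_ids:
        groups[find(c)].append(c)
    # Sort for a deterministic result.
    return sorted(sorted(members) for members in groups.values())


if __name__ == "__main__":
    contigs = ["c1", "c2", "c3", "c4", "c5"]
    hits = [
        ("c1", "c2", 95.0, 80.0),  # passes both thresholds
        ("c2", "c3", 92.0, 90.0),  # passes; links c3 to c1 via c2
        ("c3", "c4", 85.0, 95.0),  # identity too low, no link
    ]
    for group in cluster_contigs(contigs, hits):
        print(group)
```

Single linkage means two contigs end up in the same group whenever a chain of qualifying hits connects them, even if they do not align well to each other directly; stricter linkage rules would give smaller, tighter groups.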
