Nontargeted virus sequence discovery pipeline and virus clustering for metagenomic data.

Author information

Joint Genome Institute, Department of Energy, Walnut Creek, California, USA.

Publication information

Nat Protoc. 2017 Aug;12(8):1673-1682. doi: 10.1038/nprot.2017.063. Epub 2017 Jul 27.

DOI:10.1038/nprot.2017.063

Abstract

The analysis of large microbiome data sets holds great promise for the delineation of the biological and metabolic functioning of living organisms and their role in the environment. In the midst of this genomic puzzle, viruses, especially those that infect microbial communities, represent a major reservoir of genetic diversity with great impact on biogeochemical cycles and organismal health. Overcoming the limitations associated with virus detection directly from microbiomes can provide key insights into how ecosystem dynamics are modulated. Here, we present a computational protocol for accurate detection and grouping of viral sequences from microbiome samples. Our approach relies on an expanded and curated set of viral protein families used as bait to identify viral sequences directly from metagenomic assemblies. This protocol describes how to use the viral protein families catalog (∼7 h) and recommended filters for the detection of viral contigs in metagenomic samples (∼6 h), and it describes the specific parameters for a nucleotide-sequence-identity-based method of organizing the viral sequences into quasi-species taxonomic-level groups (∼10 min).
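
The abstract does not state the clustering parameters, so the grouping step can only be illustrated under assumptions. The following is a minimal sketch, assuming that pairwise nucleotide alignments between viral contigs are already available and that contigs are merged by single linkage whenever a hit passes an identity cutoff and an aligned-fraction cutoff. The function name `cluster_contigs`, the tuple layout of `hits`, and both default thresholds are hypothetical placeholders chosen for illustration; they are not the values or the implementation given in the protocol.

```python
from collections import defaultdict


def cluster_contigs(contig_ids, hits, min_identity=90.0, min_aligned_fraction=0.75):
    """Group contigs by single-linkage clustering over pairwise hits.

    contig_ids: iterable of contig names.
    hits: iterable of (query, subject, percent_identity, aligned_fraction).
    The two thresholds are illustrative assumptions, not protocol values.
    """
    contig_ids = list(contig_ids)
    parent = {c: c for c in contig_ids}

    def find(x):
        # Follow parent links to the cluster root, compressing the path.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for query, subject, identity, fraction in hits:
        # Ignore self-hits and hits to contigs outside the input set.
        if query == subject or query not in parent or subject not in parent:
            continue
        if identity >= min_identity and fraction >= min_aligned_fraction:
            root_q, root_s = find(query), find(subject)
            if root_q != root_s:
                parent[root_s] = root_q  # merge the two clusters

    groups = defaultdict(list)
    for c in contig_ids:
        groups[find(c)].append(c)
    return sorted(sorted(members) for members in groups.values())


if __name__ == "__main__":
    example_hits = [
        ("a", "b", 97.0, 0.90),  # passes both cutoffs
        ("b", "c", 96.0, 0.80),  # passes both cutoffs, links c to a and b
        ("c", "d", 80.0, 0.90),  # identity too low, d stays alone
    ]
    print(cluster_contigs(["a", "b", "c", "d"], example_hits))
```

Single linkage is used here because it is the simplest way to turn pairwise identity hits into groups; a contig joins a group as soon as it matches any one member, so the choice of cutoffs directly controls how broad each group becomes.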

Similar articles

Assessing viral taxonomic composition in benthic marine ecosystems: reliability and efficiency of different bioinformatic tools for viral metagenomic analyses.

Sci Rep. 2016 Jun 22;6:28428. doi: 10.1038/srep28428.

Origins and challenges of viral dark matter.

Virus Res. 2017 Jul 15;239:136-142. doi: 10.1016/j.virusres.2017.02.002. Epub 2017 Feb 9.

Cataloguing the taxonomic origins of sequences from a heterogeneous sample using phylogenomics: applications in adventitious agent detection.

PDA J Pharm Sci Technol. 2014 Nov-Dec;68(6):602-18. doi: 10.5731/pdajpst.2014.01023.

Enhanced bioinformatic profiling of VIDISCA libraries for virus detection and discovery.

Virus Res. 2019 Apr 2;263:21-26. doi: 10.1016/j.virusres.2018.12.010. Epub 2018 Dec 19.

ViromeScan: a new tool for metagenomic viral community profiling.

BMC Genomics. 2016 Mar 1;17:165. doi: 10.1186/s12864-016-2446-3.

Increase in taxonomic assignment efficiency of viral reads in metagenomic studies.

Virus Res. 2018 Jan 15;244:230-234. doi: 10.1016/j.virusres.2017.11.011. Epub 2017 Nov 14.

Metagenomic approaches for direct and cell culture evaluation of the virological quality of wastewater.

J Virol Methods. 2014 Dec 15;210:15-21. doi: 10.1016/j.jviromet.2014.09.017. Epub 2014 Sep 28.

MiCoP: microbial community profiling method for detecting viral and fungal organisms in metagenomic samples.

BMC Genomics. 2019 Jun 6;20(Suppl 5):423. doi: 10.1186/s12864-019-5699-9.

Accurate binning of metagenomic contigs via automated clustering sequences using information of genomic signatures and marker genes.

Sci Rep. 2016 Apr 12;6:24175. doi: 10.1038/srep24175.

Cited by

Unveiling the unknown viral world in groundwater.

Nat Commun. 2024 Aug 8;15(1):6788. doi: 10.1038/s41467-024-51230-y.

The effects of sequencing strategies on Metagenomic pathogen detection using bronchoalveolar lavage fluid samples.

Heliyon. 2024 Jun 22;10(13):e33429. doi: 10.1016/j.heliyon.2024.e33429. eCollection 2024 Jul 15.

Exploring the roles of ribosomal peptides in prokaryote-phage interactions through deep learning-enabled metagenome mining.

Microbiome. 2024 May 24;12(1):94. doi: 10.1186/s40168-024-01807-y.

Macroalgal virosphere assists with host-microbiome equilibrium regulation and affects prokaryotes in surrounding marine environments.

ISME J. 2024 Jan 8;18(1). doi: 10.1093/ismejo/wrae083.

Hidden diversity and potential ecological function of phosphorus acquisition genes in widespread terrestrial bacteriophages.

Nat Commun. 2024 Apr 2;15(1):2827. doi: 10.1038/s41467-024-47214-7.

The gut ileal mucosal virome is disturbed in patients with Crohn's disease and exacerbates intestinal inflammation in mice.

Nat Commun. 2024 Feb 22;15(1):1638. doi: 10.1038/s41467-024-45794-y.

Identification of over ten thousand candidate structured RNAs in viruses and phages.

Comput Struct Biotechnol J. 2023 Nov 7;21:5630-5639. doi: 10.1016/j.csbj.2023.11.010. eCollection 2023.

Potential metabolic and genetic interaction among viruses, methanogen and methanotrophic archaea, and their syntrophic partners.

ISME Commun. 2022 Jun 28;2(1):50. doi: 10.1038/s43705-022-00135-2.

DETIRE: a hybrid deep learning model for identifying viral sequences from metagenomes.

Front Microbiol. 2023 Jun 16;14:1169791. doi: 10.3389/fmicb.2023.1169791. eCollection 2023.

Genomic diversity and ecological distribution of marine phages.

Mar Life Sci Technol. 2023 Jan 20;5(2):271-285. doi: 10.1007/s42995-022-00160-z. eCollection 2023 May.
