Mini-Metagenomics and Nucleotide Composition Aid the Identification and Host Association of Novel Bacteriophage Sequences.

Authors

Deaton Jonathan, Yu Feiqiao Brian, Quake Stephen R

Affiliations

Department of Bioengineering, Stanford University, 443 Via Ortega, Stanford, CA, 94305, USA.

Chan Zuckerberg Biohub, 499 Illinois St, San Francisco, CA, 94158, USA.

Publication information

Adv Biosyst. 2019 Nov;3(11):e1900108. doi: 10.1002/adbi.201900108. Epub 2019 Aug 16.

Abstract

A broad spectrum of metagenomic and single cell sequencing techniques have become popular for dissecting environmental microbial diversity, leading to the characterization of thousands of novel microbial lineages. In addition to recovering bacterial and archaeal genomes, metagenomic assembly can also produce genomes of viruses that infect microbial cells. Because of their diversity, lack of marker genes, and small genome size, identifying novel bacteriophage sequences from metagenomic data is often challenging, especially when the objective is to establish phage-host relationships. The present work describes a computational approach that uses supervised learning to classify metagenomic contigs as phage or non-phage and to assign phage taxonomy based on tetranucleotide frequencies. Furthermore, the method assigns phage-host relationships using co-occurrence statistics derived from a recently developed mini-metagenomic experimental technique. This work evaluates the method's performance in identifying viral contigs and predicting taxonomic classification using publicly available references. Then, using two mini-metagenomic datasets, over 100 novel phage contigs from hot spring samples of Yellowstone National Park are identified and assigned to putative microbial hosts. Results of this work demonstrate the value of combining viral sequence identification with mini-metagenomic experimental methods to understand the microbial ecosystem.
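The two kinds of signal named in the abstract, tetranucleotide frequencies and co-occurrence, can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the handling of ambiguous bases, and the presence/absence representation of sub-samples are assumptions made purely for illustration. The abstract does not specify which classifier or which co-occurrence statistic is used, so neither is shown here.

```python
from collections import Counter
from itertools import product

ALPHABET = "ACGT"
# All 4**4 = 256 tetranucleotides in a fixed (lexicographic) order.
KMERS = ["".join(p) for p in product(ALPHABET, repeat=4)]


def tetranucleotide_frequencies(sequence):
    """Return a 256-element list of relative tetranucleotide frequencies.

    Hypothetical helper: overlapping windows of length 4 are counted on
    one strand only; windows containing ambiguous bases (e.g. N) are
    ignored. A feature vector like this could be fed to any supervised
    classifier.
    """
    seq = sequence.upper()
    counts = Counter(seq[i:i + 4] for i in range(len(seq) - 3))
    total = sum(counts[k] for k in KMERS)
    if total == 0:
        return [0.0] * len(KMERS)
    return [counts[k] / total for k in KMERS]


def cooccurrence_count(phage_presence, host_presence):
    """Count sub-samples in which both contigs were detected.

    Assumes each argument is a sequence of truthy/falsy flags, one per
    sub-sample, in the same order. This raw count is only a starting
    point; a real analysis would apply a significance test to it.
    """
    return sum(1 for p, h in zip(phage_presence, host_presence) if p and h)
```

For example, `tetranucleotide_frequencies(contig)` gives a fixed-length vector regardless of contig length, which is what makes composition a convenient feature for contigs of varying size.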

