Dieppa-Colón Etan, Martin Cody, Kosmopoulos James C, Anantharaman Karthik
Department of Bacteriology, University of Wisconsin-Madison, Madison, WI, USA.
Microbiology Doctoral Training Program, University of Wisconsin-Madison, Madison, WI, USA.
Environ Microbiome. 2025 Jan 13;20(1):5. doi: 10.1186/s40793-024-00659-1.
Viruses that infect prokaryotes (phages) constitute the most abundant group of biological agents and play pivotal roles in microbial systems. They shape microbial community dynamics, ecology, and evolution. Nevertheless, phage diversity, host range, infection dynamics, and the effects of phage infection on host cell metabolism remain largely underexplored. Phages are classified as virulent or temperate based on their life cycles. Temperate phages can adopt a lysogenic mode of infection, in which the phage genome integrates into the host cell genome to form a prophage. Prophages allow the viral genome to replicate without host cell lysis and often contribute novel, beneficial traits to the host. Current phage research focuses predominantly on lytic phages, leaving a significant gap in knowledge of prophage biology, diversity, and ecological roles.
Here we develop and describe Prophage-DB, a database of prophages, their proteins, and associated metadata that will serve as a resource for viral genomics and microbial ecology. To create the database, we identified and characterized prophages from genomes in three of the largest publicly available databases. Our pipeline applied several state-of-the-art tools to annotate these viruses, cluster them, classify them taxonomically, and detect their auxiliary metabolic genes. In total, we identified and characterized over 350,000 prophages and 35,000 auxiliary metabolic genes. Statistical results indicate that the database is highly representative, and it contains prophages from a diverse set of archaeal and bacterial hosts with a wide environmental distribution.
Prophages are particularly overlooked, yet they merit increased attention because of their vital implications for microbiomes and their hosts. We created Prophage-DB to advance the understanding of prophages in microbiomes through a comprehensive characterization of prophages in publicly available genomes. We propose that Prophage-DB will serve as a valuable resource for phage research, offering insights into viral taxonomy, host relationships, auxiliary metabolic genes, and environmental distribution.