Gauthier Christian H, Abad Lawrence, Venbakkam Ananya K, Malnak Julia, Russell Daniel A, Hatfull Graham F
Department of Biological Sciences, University of Pittsburgh, Pittsburgh, PA 15260, USA.
Nucleic Acids Res. 2022 Jul 22;50(13):e75. doi: 10.1093/nar/gkac273.
Advances in genome sequencing have produced hundreds of thousands of bacterial genome sequences, many of which have integrated prophages derived from temperate bacteriophages. These prophages play key roles by influencing bacterial metabolism, pathogenicity, antibiotic resistance, and defense against viral attack. However, they vary considerably even among related bacterial strains, and they are challenging to identify computationally and to extract precisely for comparative genomic analyses. Here, we describe DEPhT, a multimodal tool for prophage discovery and extraction. It has three run modes that facilitate rapid screening of large numbers of bacterial genomes, precise extraction of prophage sequences, and prophage annotation. DEPhT uses genomic architectural features that discriminate between phage and bacterial sequences for efficient prophage discovery, and targeted homology searches for precise prophage extraction. DEPhT is designed for prophage discovery in Mycobacterium genomes but can be adapted broadly to other bacteria. We deploy DEPhT to demonstrate that prophages are prevalent in Mycobacterium strains but are absent not only from the few well-characterized Mycobacterium tuberculosis strains but also from all ∼30 000 sequenced M. tuberculosis strains.