Bioinformatics Program, Loyola University Chicago, Chicago, IL 60660, USA.
Department of Mathematics and Statistics, Loyola University Chicago, Chicago, IL 60660, USA.
Viruses. 2023 Feb 2;15(2):420. doi: 10.3390/v15020420.
High-throughput sequencing of microbial communities has uncovered a large, diverse population of phages. Frequently, the phages found are integrated into their bacterial host genome. Distinguishing between phages in their integrated (lysogenic) and unintegrated (lytic) stages can provide insight into how phages shape bacterial communities. Here we present the Prophage Induction Estimator (PIE) to identify induced phages in genomic and metagenomic sequences. PIE takes raw sequencing reads and phage sequence predictions as input, performs read quality control and read assembly, and calculates the abundance and completeness of phage and non-phage sequences. The distribution of abundances for non-phage sequences is then used to predict induced phages with statistical confidence. In silico benchmarking found that PIE can detect induction events, even for phages with a relatively small burst size (10×). We then examined isolate genome sequencing data, a mock community data set, and a urinary metagenome data set, and found instances of induced phages in all three. The software is flexible: users can easily supply phage predictions from their preferred tool or phage sequences of interest. Thus, genomic and metagenomic sequencing now provides a means not only for discovering and identifying phage sequences but also for detecting induced prophages.
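The idea of calling a phage induced when its abundance stands out from the distribution of non-phage abundances can be illustrated with a minimal sketch. The abstract does not specify the statistical test, so the normal approximation, the one-sided p-value, the `alpha` threshold, the function name `flag_induced`, and the coverage values below are all assumptions made for illustration and should not be read as PIE's actual implementation.

```python
from statistics import NormalDist, mean, stdev


def flag_induced(background_cov, phage_cov, alpha=0.01):
    """Flag phage regions whose coverage exceeds the non-phage background.

    background_cov: per-contig coverage of non-phage sequences (list of floats)
    phage_cov:      mapping of phage name -> coverage (hypothetical input format)
    alpha:          one-sided significance threshold (assumed, not from the paper)

    Returns {phage name: (phage/background ratio, p-value, induced?)}.
    """
    mu = mean(background_cov)
    sigma = stdev(background_cov)  # needs at least two background contigs
    if sigma == 0:
        raise ValueError("background coverage has zero variance")

    # Assumption: background coverage is roughly normally distributed.
    null = NormalDist(mu, sigma)

    results = {}
    for name, cov in phage_cov.items():
        # One-sided test: probability of seeing coverage this high by chance.
        p_value = 1.0 - null.cdf(cov)
        results[name] = (cov / mu, p_value, p_value < alpha)
    return results


# Illustrative values only: one prophage at background depth, one at 10x.
background = [28.0, 30.0, 31.0, 29.0, 32.0, 30.0]
phages = {"prophage_A": 31.0, "prophage_B": 300.0}
calls = flag_induced(background, phages)

for name, (ratio, p_value, induced) in calls.items():
    print(f"{name}: ratio={ratio:.1f} p={p_value:.3g} induced={induced}")
```

In this toy input, a prophage at roughly background depth is consistent with a purely integrated state, while one at ten times the background depth is flagged. Real coverage data are often skewed, so a tool would likely need a more robust null model than the normal distribution assumed here.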