McNair Katelyn, Aziz Ramy Karam, Pusch Gordon D, Overbeek Ross, Dutilh Bas E, Edwards Robert
Computational Sciences Research Center, San Diego State University, 5500 Campanile Drive, San Diego, CA, 92182, USA.
Department of Microbiology and Immunology, Faculty of Pharmacy, Cairo University, Cairo, 11562, Egypt.
Methods Mol Biol. 2018;1681:231-238. doi: 10.1007/978-1-4939-7343-9_17.
Phages are complex biomolecular machines that must survive in a bacterial world. Phage genomes show many adaptations to this lifestyle, such as shorter genes, a reduced capacity for redundant DNA sequences, and the inclusion of tRNA genes. In addition, phages are not free-living: they require a host for replication and survival. These adaptations pose challenges for the bioinformatic analysis of phage genomes. In particular, open reading frame (ORF) calling, genome annotation, noncoding RNA (ncRNA) identification, and the identification of transposons and insertions are all complicated in phage genomes. We provide a road map through the phage genome annotation pipeline and discuss the challenges of, and solutions for, phage genome annotation as we have implemented them in the Rapid Annotation using Subsystem Technology (RAST) pipeline.
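To make the ORF-calling difficulty concrete, the following is a minimal sketch of a naive six-frame ORF scan. It is not the gene caller used in RAST; the names `find_orfs`, `Orf`, and `min_len_nt`, the set of start codons, and the default length cutoff are all assumptions chosen for illustration. It shows how a fixed minimum-length threshold, which is common in simple ORF finders, can silently drop the short genes typical of phage genomes.

```python
from typing import Iterator, NamedTuple

# Illustrative assumptions: the standard stop codons and three start codons
# that are common in bacterial and phage genes.
STOP_CODONS = {"TAA", "TAG", "TGA"}
START_CODONS = {"ATG", "GTG", "TTG"}
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


class Orf(NamedTuple):
    strand: str     # "+" or "-"
    start: int      # 0-based, inclusive, in forward-strand coordinates
    end: int        # 0-based, exclusive, in forward-strand coordinates
    length_nt: int  # length including the stop codon


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of an unambiguous DNA string."""
    return seq.translate(_COMPLEMENT)[::-1]


def find_orfs(seq: str, min_len_nt: int = 90) -> Iterator[Orf]:
    """Yield start-to-stop ORFs of at least min_len_nt in all six frames.

    Hypothetical helper: for each stop codon, the ORF begins at the first
    start codon seen in that frame since the previous stop.
    """
    seq = seq.upper()
    n = len(seq)
    for strand, s in (("+", seq), ("-", reverse_complement(seq))):
        for frame in range(3):
            orf_start = None
            for i in range(frame, n - 2, 3):
                codon = s[i:i + 3]
                if orf_start is None and codon in START_CODONS:
                    orf_start = i
                elif orf_start is not None and codon in STOP_CODONS:
                    end = i + 3
                    length = end - orf_start
                    if length >= min_len_nt:
                        if strand == "+":
                            yield Orf("+", orf_start, end, length)
                        else:
                            # Map reverse-strand positions back to the forward strand.
                            yield Orf("-", n - end, n - orf_start, length)
                    orf_start = None


if __name__ == "__main__":
    # A toy 36-nt gene: found with a permissive cutoff, missed with the default.
    toy = "CC" + "ATG" + "AAA" * 10 + "TAA" + "GG"
    print(list(find_orfs(toy, min_len_nt=30)))
    print(list(find_orfs(toy)))
```

Lowering `min_len_nt` recovers short genes but also admits more spurious ORFs, which is one reason dedicated gene callers rely on additional evidence rather than length alone.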