Global Security Computing Applications Division, Lawrence Livermore National Laboratory, Fort Detrick, MD, USA.
Biological Defense Research Directorate, Naval Medical Research Center, Fort Detrick, MD, USA.
Bioinformatics. 2019 Nov 1;35(21):4402-4404. doi: 10.1093/bioinformatics/btz258.
To address the need for phage annotation tools that scale, we created an automated high-throughput annotation pipeline: the multiple-genome Phage Annotation Toolkit and Evaluator (multiPhATE). multiPhATE is a pipeline driver that invokes an annotation pipeline (PhATE) across a user-specified set of phage genomes. The tool incorporates a de novo phage gene calling algorithm and assigns putative functions to gene calls using protein-, virus- and phage-centric databases. Its modular construction lets the user run all or any portion of the analyses by acquiring local instances of the desired databases and specifying the desired analyses in a configuration file. We demonstrate multiPhATE by annotating two newly sequenced Yersinia pestis phage genomes. Within multiPhATE, the PhATE processing pipeline can be readily run across multiple processors, making it adaptable for high-throughput sequencing projects. Software documentation assists the user in configuring the system.
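The driver pattern described above (one annotation pipeline applied to each genome in a user-specified set, with the work spread over several workers) can be illustrated with a minimal Python sketch. This is not multiPhATE's code or configuration format: `GenomeJob`, `find_orfs_stub`, `annotate_genome` and `run_batch` are hypothetical names, and the gene caller is a toy forward-strand ORF scan standing in for the real gene calling and database searches.

```python
from concurrent.futures import ThreadPoolExecutor
from dataclasses import dataclass


@dataclass
class GenomeJob:
    """One genome to annotate (hypothetical structure, not multiPhATE's config)."""
    name: str
    sequence: str
    analyses: tuple = ("gene_calling",)  # which analyses the user requested


def find_orfs_stub(sequence, min_codons=2):
    """Toy stand-in for gene calling: forward-strand ATG...stop scan in 3 frames."""
    stops = {"TAA", "TAG", "TGA"}
    orfs = []
    for frame in range(3):
        start = None
        for i in range(frame, len(sequence) - 2, 3):
            codon = sequence[i:i + 3]
            if start is None and codon == "ATG":
                start = i  # remember the first start codon in this frame
            elif start is not None and codon in stops:
                if (i - start) // 3 >= min_codons:
                    orfs.append((start, i + 3))  # half-open interval incl. stop
                start = None
    return sorted(orfs)


def annotate_genome(job):
    """Run only the analyses requested for this genome."""
    calls = find_orfs_stub(job.sequence) if "gene_calling" in job.analyses else []
    return {"genome": job.name, "gene_calls": calls}


def run_batch(jobs, max_workers=2):
    """Driver: apply the per-genome pipeline to every job using a worker pool."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        results = list(pool.map(annotate_genome, jobs))  # preserves input order
    return {r["genome"]: r for r in results}


if __name__ == "__main__":
    batch = [GenomeJob("phageA", "ATGAAATAA"), GenomeJob("phageB", "CCCCCC")]
    for name, result in run_batch(batch).items():
        print(name, result["gene_calls"])
```

The sketch uses threads for portability. For CPU-bound pure-Python work, swapping in `concurrent.futures.ProcessPoolExecutor` would spread the jobs over multiple processors with the same `map` interface.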
multiPhATE was implemented in Python 3.7 and runs as a command-line program under Linux or Unix. It is freely available under an open-source BSD3 license from https://github.com/carolzhou/multiPhATE. Instructions for acquiring the databases and third-party software used by multiPhATE are included in the distribution README file. Users may report bugs through the GitHub issues page associated with the multiPhATE distribution.
Supplementary data are available at Bioinformatics online.