J. Craig Venter Institute, 9704 Medical Center Drive, Rockville, MD 20850 and Pacific Northwest National Laboratory, 902 Battelle Blvd., Richland, WA 99354, USA.
Bioinformatics. 2014 May 15;30(10):1469-70. doi: 10.1093/bioinformatics/btu051. Epub 2014 Jan 27.
We present the first public release of our proteogenomic annotation pipeline. We previously used the original, unreleased implementation to improve the annotation of 46 diverse prokaryotic genomes: by analyzing proteomic mass-spectrometry data, we discovered novel genes and post-translational modifications and corrected erroneous annotations. This public version has been redesigned to run in a wide range of parallel Linux computing environments and is provided with automated configuration, build and testing facilities for easy deployment and portability.
Source code is freely available from https://bitbucket.org/andreyto/proteogenomics under the GPL license. The pipeline is implemented in Python and C++ and bundles the Makeflow engine to execute its workflows.