Institute of Computer Science, University of Göttingen, Goldschmidtstrasse 7, Göttingen, Germany.
Bioinformatics. 2011 Mar 15;27(6):757-63. doi: 10.1093/bioinformatics/btr010. Epub 2011 Jan 6.
As improved DNA sequencing techniques have enormously increased the speed at which new eukaryotic genome assemblies are produced, the further development of automated gene prediction methods continues to be essential. While the classification of proteins into families is a task that relies heavily on correct gene predictions, it can at the same time provide a source of additional information for the prediction, complementary to the sources presently used.
We extended the gene prediction software AUGUSTUS with a method that employs block profiles generated from multiple sequence alignments as a protein signature to improve the accuracy of the prediction. Equipped with profiles modelling human dynein heavy chain (DHC) proteins and other families, AUGUSTUS was run on genomic sequences known to contain members of these families. The rate of genes predicted with high accuracy was dramatically higher than with the ab initio version of AUGUSTUS.
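To illustrate the general idea of a block profile, the following is a minimal Python sketch, not the AUGUSTUS implementation. It builds position-specific log-odds scores from one ungapped alignment block and slides the profile along a candidate peptide. The function names, the uniform background distribution, the pseudocount of 1.0, and the toy sequences are all assumptions made for illustration only.

```python
import math
from collections import Counter

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def build_block_profile(block, pseudocount=1.0):
    """Return one dict of log-odds scores per column of an ungapped block.

    `block` is a list of equal-length aligned sequences (hypothetical input).
    A uniform background frequency is assumed for simplicity.
    """
    width = len(block[0])
    if any(len(row) != width for row in block):
        raise ValueError("all rows of a block must have the same length")
    background = 1.0 / len(AMINO_ACIDS)
    total = len(block) + pseudocount * len(AMINO_ACIDS)
    profile = []
    for col in range(width):
        counts = Counter(row[col] for row in block)
        profile.append({
            aa: math.log(((counts[aa] + pseudocount) / total) / background)
            for aa in AMINO_ACIDS
        })
    return profile


def score_window(profile, peptide, start):
    """Sum the column scores for the window of `peptide` beginning at `start`.

    Only the 20 standard amino acid letters are supported in this sketch.
    """
    return sum(column[peptide[start + i]] for i, column in enumerate(profile))


def best_hit(profile, peptide):
    """Return (start, score) of the highest-scoring window in `peptide`."""
    width = len(profile)
    if len(peptide) < width:
        raise ValueError("peptide is shorter than the profile")
    scores = [
        (score_window(profile, peptide, start), start)
        for start in range(len(peptide) - width + 1)
    ]
    best_score, best_start = max(scores)
    return best_start, best_score


# Toy data, invented for this example.
toy_block = ["GKSTL", "GKTTL", "GKSSL"]
toy_profile = build_block_profile(toy_block)
hit_start, hit_score = best_hit(toy_profile, "MAAGKSTLVV")
```

In a hybrid predictor, a score of this kind could act as additional evidence favouring candidate gene structures whose translated product matches the family signature. How AUGUSTUS actually integrates block profiles into its model is not described in this abstract.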
The AUGUSTUS project web page is located at http://augustus.gobics.de, with the executable program as well as the source code available for download.