Hoff Katharina J, Lingner Thomas, Meinicke Peter, Tech Maike
Abteilung Bioinformatik, Institut für Mikrobiologie und Genetik, Georg-August-Universität Göttingen, Goldschmidtstr. 1, 37077 Göttingen, Germany.
Nucleic Acids Res. 2009 Jul;37(Web Server issue):W101-5. doi: 10.1093/nar/gkp327. Epub 2009 May 8.
Metagenomic sequencing projects yield large numbers of sequencing reads from a diverse range of uncultivated and mostly still unknown microorganisms. In many cases, these reads cannot be assembled into longer contigs. Gene prediction tools originally developed for whole-genome analysis are therefore not suitable for processing metagenomes. Orphelia is a program for predicting genes in short DNA sequences and is available through a web server application (http://orphelia.gobics.de). Orphelia uses prediction models that were created with machine learning techniques on the basis of a wide range of annotated genomes. In contrast to other methods for metagenomic gene prediction, Orphelia has fragment length-specific prediction models for the two most popular sequencing techniques in metagenomics, chain termination sequencing and pyrosequencing. These models ensure highly specific gene predictions.
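The idea of fragment length-specific models can be illustrated with a minimal sketch. It is not Orphelia's code or interface: the model labels, the typical read lengths (700 bp and 300 bp), and the helper names `parse_fasta` and `choose_model` are all assumptions made for illustration. The sketch only shows how a set of reads might be matched to the model trained for the closest fragment length.

```python
from statistics import median

# Hypothetical model labels mapped to assumed typical read lengths in bp.
# These values are illustrative assumptions, not Orphelia's configuration.
MODELS = {
    "chain_termination": 700,  # assumed typical chain termination read length
    "pyrosequencing": 300,     # assumed typical pyrosequencing read length
}


def parse_fasta(text):
    """Return a dict mapping FASTA headers to their concatenated sequences."""
    records = {}
    header = None
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            header = line[1:]
            records[header] = []
        elif header is not None:
            records[header].append(line)
    return {name: "".join(parts) for name, parts in records.items()}


def choose_model(read_lengths):
    """Pick the model whose assumed typical length is closest to the median read length."""
    if not read_lengths:
        raise ValueError("no reads given")
    mid = median(read_lengths)
    return min(MODELS, key=lambda name: abs(MODELS[name] - mid))


if __name__ == "__main__":
    reads = parse_fasta(">read1\nACGTACGT\nACGT\n>read2\nGGCC\n")
    lengths = [len(seq) for seq in reads.values()]
    print(choose_model(lengths))
```

Using the median rather than the mean keeps a few unusually long or short reads from changing the choice; a real pipeline would more likely let the user state the sequencing technique directly.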