Craven M, Page D, Shavlik J, Bockhorst J, Glasner J
Dept. of Biostatistics & Medical Informatics, University of Wisconsin, Madison 53706, USA.
Proc Int Conf Intell Syst Mol Biol. 2000;8:116-27.
We present a computational approach to predicting operons in the genomes of prokaryotic organisms. Our approach uses machine learning methods to induce predictive models for this task from a rich variety of data types including sequence data, gene expression data, and functional annotations associated with genes. We use multiple learned models that individually predict promoters, terminators and operons themselves. A key part of our approach is a dynamic programming method that uses our predictions to map every known and putative gene in a given genome into its most probable operon. We evaluate our approach using data from the E. coli K-12 genome.
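The abstract does not spell out the dynamic programming step, so the following is only a minimal sketch of one way such a mapping could work, not the authors' algorithm. It assumes that genes are indexed in genome order, that an operon is a contiguous run of genes, that each candidate operon has an independent probability, and that the best map is the partition maximizing the product of those probabilities. The names `most_probable_operon_map`, `log_score`, and `max_len`, and the toy probabilities, are hypothetical and chosen for illustration.

```python
import math
from typing import Callable, List, Tuple


def most_probable_operon_map(
    n_genes: int,
    log_score: Callable[[int, int], float],
    max_len: int = 10,
) -> Tuple[float, List[Tuple[int, int]]]:
    """Partition genes 0..n_genes-1 into contiguous candidate operons.

    log_score(i, j) is a hypothetical log-probability that genes i..j
    (inclusive) form exactly one operon. max_len is an assumed cap on
    operon length that keeps the search quadratic at worst.
    """
    # best[k] = best total log-probability for the first k genes.
    best = [-math.inf] * (n_genes + 1)
    # back[k] = start index of the last operon in that best partition.
    back = [0] * (n_genes + 1)
    best[0] = 0.0

    for end in range(1, n_genes + 1):
        for start in range(max(0, end - max_len), end):
            candidate = best[start] + log_score(start, end - 1)
            if candidate > best[end]:
                best[end] = candidate
                back[end] = start

    # Walk the back-pointers to recover the operon boundaries.
    operons: List[Tuple[int, int]] = []
    end = n_genes
    while end > 0:
        start = back[end]
        operons.append((start, end - 1))
        end = start
    operons.reverse()
    return best[n_genes], operons


# Toy probabilities for three genes (made-up numbers, not from the paper).
toy_probs = {
    (0, 0): 0.5, (1, 1): 0.2, (2, 2): 0.6,
    (0, 1): 0.7, (1, 2): 0.1, (0, 2): 0.05,
}


def toy_log_score(i: int, j: int) -> float:
    return math.log(toy_probs[(i, j)])


total_log_prob, operons = most_probable_operon_map(3, toy_log_score)
print(operons)  # [(0, 1), (2, 2)]
```

In this toy case the partition that groups genes 0 and 1 and leaves gene 2 alone wins with probability 0.7 × 0.6 = 0.42. In a real system the scores would presumably come from the learned promoter, terminator, and operon models the abstract mentions; how those are combined is not stated here.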