Yu Hui, Gao Lei, Tu Kang, Guo Zheng
Department of Bioinformatics, Harbin Medical University, Harbin 150086, China.
Gene. 2005 Jun 6;352:75-81. doi: 10.1016/j.gene.2005.03.033.
Previous studies on computational gene function prediction have not fully exploited the taxonomy structure of the Gene Ontology (GO). They typically select a few GO classes, often according to annotation size, and learn each class separately. This pre-selection limits the breadth and depth of prediction. It also ignores the taxonomic relations among classes, and so wastes the functional knowledge encoded in the directed acyclic graph (DAG) structure of GO. This paper proposes GESTS, a novel gene function prediction approach based on both gene expression similarity and GO taxonomy similarity, which avoids the arbitrary pre-selection of learning classes. GESTS is a semi-supervised approach that efficiently incorporates ontology-based functional knowledge into the automated functional analysis of local gene clustering. By integrating both expression similarity and taxonomy similarity into the learning process, GESTS achieves better prediction breadth, depth, and precision than previous studies on the fibroblast serum response dataset and the yeast expression dataset.
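The abstract does not give the GESTS algorithm itself, so the following is only a minimal sketch of the general idea of combining expression similarity with the GO hierarchy. Everything in it is an assumption made for illustration: the toy DAG in `PARENTS`, the Jaccard-style `taxonomy_similarity`, the use of Pearson correlation, and the `predict` function with its `k` and `min_support` parameters. Annotations of expression neighbours are propagated to their ancestor terms, and the deepest term with enough support is returned, so no learning classes have to be selected in advance.

```python
from math import sqrt

# Hypothetical toy GO fragment: child term -> list of parent terms (a DAG).
PARENTS = {
    "root": [],
    "metabolism": ["root"],
    "cell_cycle": ["root"],
    "dna_replication": ["metabolism", "cell_cycle"],
    "mitosis": ["cell_cycle"],
}

def ancestors(term):
    """Return the term itself plus all of its ancestors in the DAG."""
    seen, stack = set(), [term]
    while stack:
        t = stack.pop()
        if t not in seen:
            seen.add(t)
            stack.extend(PARENTS[t])
    return seen

def taxonomy_similarity(a, b):
    """Jaccard overlap of ancestor sets (an assumed stand-in measure)."""
    anc_a, anc_b = ancestors(a), ancestors(b)
    return len(anc_a & anc_b) / len(anc_a | anc_b)

def pearson(x, y):
    """Pearson correlation of two equal-length expression profiles."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy) if sx and sy else 0.0

def predict(profile, annotated, k=3, min_support=0.6):
    """Predict a GO term for an unannotated expression profile.

    annotated: list of (expression profile, GO term) pairs.
    The k most correlated genes vote for their term and all its
    ancestors; the deepest term with enough weighted support wins.
    """
    neighbours = sorted(
        annotated, key=lambda g: pearson(profile, g[0]), reverse=True
    )[:k]
    weights = [max(pearson(profile, p), 0.0) for p, _ in neighbours]
    total = sum(weights)
    if total == 0:
        return None
    support = {}
    for w, (_, term) in zip(weights, neighbours):
        for t in ancestors(term):
            support[t] = support.get(t, 0.0) + w / total
    candidates = [t for t, s in support.items() if s >= min_support]
    # Depth is approximated by ancestor count; ties are broken by
    # support and then by name so the result is deterministic.
    return max(candidates, key=lambda t: (len(ancestors(t)), support[t], t))

genes = [
    ([1, 2, 3, 4], "dna_replication"),
    ([2, 4, 6, 8.5], "mitosis"),
    ([1, 2, 3, 5], "dna_replication"),
    ([4, 3, 2, 1], "metabolism"),
]
print(predict([1, 2, 3, 4.2], genes))
```

When the neighbours disagree on a specific term, raising `min_support` makes the sketch fall back to a shallower class that they share, which is one way a taxonomy can trade prediction depth against confidence.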