Li Xiao-Li, Tan Yin-Chet, Ng See-Kiong
Knowledge Discovery Department, Institute for Infocomm Research, 21 Heng Mui Keng Terrace, Singapore 119613, Singapore.
BMC Bioinformatics. 2006 Dec 12;7 Suppl 4(Suppl 4):S23. doi: 10.1186/1471-2105-7-S4-S23.
Microarray experiments now make it possible to monitor the expression levels of thousands of genes quantitatively and simultaneously under various experimental conditions. However, gaps remain in using gene expression data for whole-genome functional annotation of genes.
In this paper, we propose a novel technique called Fuzzy Nearest Clusters for genome-wide functional annotation of unclassified genes. The technique consists of two steps: an initial hierarchical clustering step that detects homogeneous co-expressed gene subgroups, or clusters, within each possibly heterogeneous functional class, followed by a classification step that predicts the functional roles of the unclassified genes based on their similarities to the detected functional clusters.
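The two steps described above can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not specify the similarity measure, the linkage rule, or how fuzzy memberships are computed, so the Pearson correlation, the centroid-based merging with a `min_sim` threshold, the `k` nearest clusters, and the toy expression profiles below are all assumptions chosen for illustration.

```python
import math

def pearson(a, b):
    """Pearson correlation between two expression profiles (assumed similarity)."""
    n = len(a)
    ma, mb = sum(a) / n, sum(b) / n
    cov = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    na = math.sqrt(sum((x - ma) ** 2 for x in a))
    nb = math.sqrt(sum((y - mb) ** 2 for y in b))
    return cov / (na * nb) if na and nb else 0.0

def centroid(profiles):
    """Mean expression profile of a group of genes."""
    return [sum(col) / len(profiles) for col in zip(*profiles)]

def cluster_class(profiles, min_sim=0.8):
    """Step 1 (assumed form): agglomerative clustering within one functional class.
    Repeatedly merge the two clusters with the most similar centroids until
    no pair reaches min_sim. Returns one centroid per detected cluster."""
    clusters = [[p] for p in profiles]
    while len(clusters) > 1:
        best, pair = min_sim, None
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                s = pearson(centroid(clusters[i]), centroid(clusters[j]))
                if s >= best:
                    best, pair = s, (i, j)
        if pair is None:
            break
        i, j = pair
        merged = clusters.pop(j)      # j > i, so index i is unaffected
        clusters[i].extend(merged)
    return [centroid(c) for c in clusters]

def fit(labelled, min_sim=0.8):
    """Build (class, cluster centroid) pairs for every functional class."""
    return [(cls, c)
            for cls, profiles in labelled.items()
            for c in cluster_class(profiles, min_sim)]

def predict(model, profile, k=2):
    """Step 2 (assumed form): fuzzy class memberships from the k nearest clusters,
    weighted by non-negative similarity and normalised to sum to 1."""
    nearest = sorted(((pearson(profile, c), cls) for cls, c in model),
                     reverse=True)[:k]
    weights = {}
    for s, cls in nearest:
        weights[cls] = weights.get(cls, 0.0) + max(s, 0.0)
    total = sum(weights.values())
    return {cls: w / total for cls, w in weights.items()} if total else {}

# Hypothetical toy data: "ribosome" is heterogeneous (two opposite trends).
labelled = {
    "ribosome": [[1, 2, 3, 4], [1.1, 2.1, 2.9, 4.2],
                 [4, 3, 2, 1], [4.2, 3.1, 2, 0.9]],
    "stress": [[1, 3, 1, 3], [1.2, 2.9, 0.8, 3.1]],
}
model = fit(labelled)
memberships = predict(model, [0.9, 2.0, 3.1, 3.9])
print(len(model), memberships)
```

In this toy setup the heterogeneous class is split into two clusters before classification, so an unclassified gene is compared against homogeneous subgroups rather than against a single averaged class profile. Returning a membership for more than one class is what would allow a gene to be assigned multiple functional roles.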
Our experimental results with yeast gene expression data showed that the proposed method can accurately predict the functions of genes, including genes with multiple functional roles. Compared with other conventional gene function prediction approaches, its prediction performance was the least affected by the underlying heterogeneity of the complex functional classes.