Sawa Tomohiro, Ohno-Machado Lucila
Division of Health Sciences and Technology, Harvard Medical School and Massachusetts Institute of Technology, 77 Massachusetts Avenue, Cambridge, MA 02139, USA.
Comput Biol Med. 2003 Jan;33(1):1-15. doi: 10.1016/s0010-4825(02)00032-x.
A common approach to analyzing gene expression data is to define clusters of genes that have similar expression. A critical step in cluster analysis is determining the similarity between the expression levels of two genes. We introduce a non-linear similarity index based on a neural network and compare it with other proximity measures on Saccharomyces cerevisiae gene expression data. We show that the clusters obtained using Euclidean distance, correlation coefficients, and mutual information were not significantly different from one another. The clusters formed with the neural network-based index agreed more closely with those defined by functional categories and common regulatory motifs.
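To make the three conventional proximity measures named in the abstract concrete, the following is a minimal sketch of how each could be computed for a pair of expression profiles. It is not the authors' code and does not reproduce their neural network-based index. The function names, the equal-width binning used for mutual information, and the default of three bins are assumptions chosen for illustration only.

```python
import math
from collections import Counter


def euclidean_distance(x, y):
    """Euclidean distance between two equal-length expression profiles."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x, y)))


def pearson_correlation(x, y):
    """Pearson correlation coefficient; assumes neither profile is constant."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


def discretize(values, bins):
    """Map each value to an equal-width bin index in [0, bins - 1].

    Equal-width binning is an assumption of this sketch, not a detail
    taken from the paper.
    """
    lo, hi = min(values), max(values)
    width = (hi - lo) / bins or 1.0  # constant profile: everything in bin 0
    return [min(int((v - lo) / width), bins - 1) for v in values]


def mutual_information(x, y, bins=3):
    """Mutual information (in bits) between two discretized profiles."""
    bx, by = discretize(x, bins), discretize(y, bins)
    n = len(x)
    joint = Counter(zip(bx, by))
    px, py = Counter(bx), Counter(by)
    # Sum p(i, j) * log2( p(i, j) / (p(i) * p(j)) ) over observed bin pairs.
    return sum(
        (c / n) * math.log2(c * n / (px[i] * py[j]))
        for (i, j), c in joint.items()
    )


if __name__ == "__main__":
    # Hypothetical expression levels for two genes across six conditions.
    gene_a = [0.1, 0.5, 1.2, 2.0, 1.1, 0.3]
    gene_b = [0.2, 1.0, 2.4, 4.0, 2.2, 0.6]
    print("Euclidean distance:", euclidean_distance(gene_a, gene_b))
    print("Pearson correlation:", pearson_correlation(gene_a, gene_b))
    print("Mutual information (bits):", mutual_information(gene_a, gene_b))
```

Note that Euclidean distance is a dissimilarity (smaller means more alike), whereas correlation and mutual information are similarities, so a clustering routine would need to convert them to a common orientation before comparing the resulting clusters.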