Ahmed Hasin Afzal, Mahanta Priyakshi, Bhattacharyya Dhruba Kumar, Kalita Jugal Kumar
IEEE/ACM Trans Comput Biol Bioinform. 2014 Nov-Dec;11(6):1239-52. doi: 10.1109/TCBB.2014.2323054.
Genes in a biologically significant group can be correlated in several different ways, which makes it difficult to develop effective methods for gene expression data analysis. Computational biologists initially worked only with absolute and shifting correlations. However, researchers have found that handling shifting-and-scaling correlation lets them extract more biologically relevant and interesting patterns from gene microarray data. In this paper, we introduce an effective shifting-and-scaling correlation measure named Shifting and Scaling Similarity (SSSim), which can detect highly correlated gene pairs in any gene expression data set. We also introduce a biclustering algorithm named Intensive Correlation Search (ICS), which uses SSSim to extract biologically significant biclusters from a gene expression data set. The technique performs satisfactorily on a number of benchmark gene expression data sets when evaluated in terms of functional categories in the Gene Ontology database.
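To make the distinction between shifting and shifting-and-scaling patterns concrete, the following is a minimal sketch on made-up expression profiles. It does not implement SSSim, whose formula is not given in this abstract; Pearson correlation is used here only as a familiar stand-in that is also invariant to a shift plus a positive scaling. The profile values and the helper names `pearson` and `difference_spread` are assumptions introduced for illustration.

```python
import math


def pearson(x, y):
    """Pearson correlation of two equal-length expression profiles.

    Used only as a stand-in; this is NOT the SSSim measure from the paper.
    """
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    norm_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    norm_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (norm_x * norm_y)


def difference_spread(x, y):
    """Range of the per-condition differences y - x (0 for a pure shift)."""
    diffs = [b - a for a, b in zip(x, y)]
    return max(diffs) - min(diffs)


# Hypothetical expression levels of one gene across five conditions.
base = [1.0, 3.0, 2.0, 5.0, 4.0]

# Shifting pattern: the second gene is offset by a constant.
shifted = [v + 2.0 for v in base]

# Shifting-and-scaling pattern: the second gene is scaled, then offset.
shift_scaled = [3.0 * v + 2.0 for v in base]

# A constant-difference test recognises the pure shift ...
shift_spread = difference_spread(base, shifted)
# ... but not the shifting-and-scaling pair, whose differences vary.
scaled_spread = difference_spread(base, shift_scaled)
# A shift-and-scale-invariant measure still rates that pair as perfectly related.
scaled_corr = pearson(base, shift_scaled)

print(shift_spread, scaled_spread, scaled_corr)
```

A measure that only looks for constant offsets would miss the third profile, which is the kind of gene pair a shifting-and-scaling measure is meant to capture.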