IEEE/ACM Trans Comput Biol Bioinform. 2017 Nov-Dec;14(6):1419-1433. doi: 10.1109/TCBB.2016.2598163. Epub 2016 Aug 3.
One of the most significant research issues in functional genomics is the in silico identification of disease-related genes. In this regard, the paper presents a new gene selection algorithm, termed SiFS, for the identification of disease genes. It integrates information obtained from the protein-protein interaction network with gene expression profiles. The proposed SiFS algorithm selects a subset of genes from microarray data as disease genes by maximizing both the significance and the functional similarity of the selected gene subset. Based on the gene expression profiles, the significance of a gene with respect to another gene is computed using mutual information. In addition, a new similarity measure is introduced to compute the functional similarity between two genes. Information derived from the protein-protein interaction network forms the basis of the proposed SiFS algorithm. The performance of the proposed gene selection algorithm and of the new similarity measure is compared with that of other related methods and similarity measures on several cancer microarray data sets.
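The selection idea described above can be illustrated with a minimal Python sketch. The abstract gives no formulas, so everything specific here is an assumption rather than the published SiFS method: significance of one gene with respect to another is taken to be the extra mutual information about the class labels that the second gene contributes once the first is known, the functional similarity is a stand-in Jaccard overlap of protein-protein interaction neighborhoods (not the paper's new measure), and the function names, the `weight` parameter, and the greedy search are hypothetical. Expression values are assumed to be already discretized.

```python
import math
from collections import Counter


def mutual_information(x, y):
    """Mutual information (in bits) between two equal-length discrete sequences."""
    n = len(x)
    px, py, pxy = Counter(x), Counter(y), Counter(zip(x, y))
    return sum(
        (c / n) * math.log2((c / n) / ((px[a] / n) * (py[b] / n)))
        for (a, b), c in pxy.items()
    )


def significance(gi, gj, labels):
    """Assumed definition: I({gi, gj}; labels) - I(gi; labels),
    i.e. the class information gj adds once gi is known."""
    joint = list(zip(gi, gj))
    return mutual_information(joint, labels) - mutual_information(gi, labels)


def neighbor_jaccard(a, b, ppi):
    """Stand-in functional similarity: Jaccard overlap of the PPI
    neighborhoods of genes a and b (each gene counts as its own neighbor)."""
    na = ppi.get(a, set()) | {a}
    nb = ppi.get(b, set()) | {b}
    return len(na & nb) / len(na | nb)


def greedy_select(expr, labels, ppi, k, weight=0.5):
    """Greedily pick up to k genes, trading off average significance
    against average functional similarity to the genes already chosen."""
    genes = sorted(expr)
    # Seed with the gene that is most informative about the class labels.
    selected = [max(genes, key=lambda g: mutual_information(expr[g], labels))]
    while len(selected) < k:
        candidates = [g for g in genes if g not in selected]
        if not candidates:
            break

        def score(g):
            sig = sum(significance(expr[s], expr[g], labels) for s in selected)
            sim = sum(neighbor_jaccard(s, g, ppi) for s in selected)
            return (weight * sig + (1 - weight) * sim) / len(selected)

        selected.append(max(candidates, key=score))
    return selected


# Toy data: three genes, four samples, two classes (all values illustrative).
expr = {"A": [0, 0, 1, 1], "B": [0, 1, 0, 1], "C": [0, 0, 1, 0]}
labels = [0, 0, 1, 1]
ppi = {"A": {"C"}, "C": {"A"}}
chosen = greedy_select(expr, labels, ppi, k=2)
```

In this toy run, gene A is the seed because it predicts the labels perfectly. Neither B nor C adds class information beyond A, so the interaction-network term decides the second pick. A single weighted sum is only one way to combine the two criteria; the abstract states that both are maximized but not how they are balanced.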