Wang Qing, Zhang Siyi, Pang Shichao, Zhang Menghuan, Wang Bo, Liu Qi, Li Jing
Department of Bioinformatics & Biostatistics, School of Life Science and Biotechnology, Shanghai Jiao Tong University, Shanghai, China.
Department of Bioinformatics & Biostatistics, School of Life Science and Biotechnology, Shanghai Jiao Tong University, Shanghai, China; Department of Biomedical Informatics, Vanderbilt University School of Medicine, Nashville, Tennessee, United States of America; Center for Quantitative Sciences, Vanderbilt University School of Medicine, Nashville, Tennessee, United States of America.
PLoS One. 2014 Oct 16;9(10):e110406. doi: 10.1371/journal.pone.0110406. eCollection 2014.
Many cellular activities are organized as networks, and genes cluster into co-expressed groups when they share the same or closely related biological functions or are co-regulated. In this study, we assumed that a strong candidate disease gene is more likely to lie close to gene groups whose members are all coordinately differentially expressed than to individual differentially expressed genes. Based on this assumption, we developed a novel disease gene prioritization method, GroupRank, which integrates gene co-expression and differential expression information derived from microarray data with a protein-protein interaction (PPI) network. GroupRank ranks a candidate gene highly if it is differentially expressed between disease and control samples or if it lies close to differentially co-expressed groups in the PPI network. We tested the method on data sets for lung cancer, kidney cancer, leukemia and breast cancer. GroupRank prioritized disease genes efficiently, with a significantly improved area under the curve (AUC) compared with the previous method, which does not consider co-expressed gene groups in the PPI network. Moreover, functional analysis of the gene group that contributed most to the kidney cancer prioritization showed that GroupRank not only ranks disease genes efficiently but can also help identify and understand possible mechanisms underlying important physiological and pathological processes of the disease.
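The ranking idea stated in the abstract (a candidate scores highly if it is itself differentially expressed, or if it lies close to a differentially co-expressed group in the PPI network) can be illustrated with a toy sketch. This is not the published GroupRank algorithm: the scoring formula, the function names `bfs_distances` and `toy_group_rank`, the gene labels, and all numbers below are assumptions made purely for illustration.

```python
from collections import deque


def bfs_distances(adjacency, source):
    """Hop distances from source to every reachable node of an unweighted graph."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        for neighbor in adjacency.get(node, ()):
            if neighbor not in dist:
                dist[neighbor] = dist[node] + 1
                queue.append(neighbor)
    return dist


def toy_group_rank(adjacency, de_score, groups, candidates):
    """Rank candidates by an ASSUMED score (not the paper's formula):
    the larger of the gene's own differential-expression score and the
    best group score discounted by network distance to that group."""
    scores = {}
    for gene in candidates:
        dist = bfs_distances(adjacency, gene)
        best = de_score.get(gene, 0.0)
        for members in groups:
            reachable = [dist[m] for m in members if m in dist]
            if not reachable:
                continue  # group is not connected to this candidate
            # Assumption: a group counts only as much as its weakest member,
            # mimicking "all members coordinately differentially expressed".
            group_de = min(de_score.get(m, 0.0) for m in members)
            best = max(best, group_de / (1 + min(reachable)))
        scores[gene] = best
    ranking = sorted(candidates, key=lambda g: scores[g], reverse=True)
    return ranking, scores


# Hypothetical PPI network: a simple chain A-B-C-D-E-F.
edges = [("A", "B"), ("B", "C"), ("C", "D"), ("D", "E"), ("E", "F")]
adjacency = {}
for u, v in edges:
    adjacency.setdefault(u, set()).add(v)
    adjacency.setdefault(v, set()).add(u)

# Hypothetical differential-expression scores in [0, 1].
de_score = {"A": 0.1, "B": 0.9, "C": 0.8, "D": 0.0, "E": 0.7, "F": 0.0}
groups = [{"B", "C"}]  # one hypothetical differentially co-expressed group

ranking, scores = toy_group_rank(adjacency, de_score, groups, ["A", "E", "F"])
print(ranking)
```

In this toy network, gene A is only weakly differentially expressed on its own but sits next to the co-expressed group {B, C}, so it ranks above gene F, which is both weakly expressed and far from the group. Gene E ranks first because of its own strong differential expression.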