Bioinformatics Centre, Key Laboratory for NeuroInformation of Ministry of Education and School of Life Science and Technology, University of Electronic Science and Technology of China, Chengdu, 610054, China.
Gene. 2013 Sep 10;526(2):232-8. doi: 10.1016/j.gene.2013.05.011. Epub 2013 May 22.
In microarray-based case-control studies of a disease, researchers often try to identify a few diagnostic or prognostic markers among the most significant differentially expressed (DE) genes. However, the DE genes identified for the same disease in different studies typically show very low reproducibility. To tackle this problem, we suggest evaluating the reproducibility of DE genes across studies and defining robust markers for disease diagnosis using a disease-associated protein-protein interaction (PPI) subnetwork. Using datasets for four cancer types, we found that the most significant DE genes in cancer exhibit consistent up- or down-regulation across datasets. For each cancer type, the 5 (or 10) most significant DE genes extracted separately from different datasets tend to be significantly coexpressed and closely connected in the PPI subnetwork, indicating that they are highly reproducible at the PPI level. Consequently, we were able to build robust subnetwork-based classifiers for cancer diagnosis.
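One way to picture the cross-dataset check of up- or down-regulation described above is the following minimal Python sketch. It is an illustration under stated assumptions, not the procedure used in the study: the abstract does not say which test statistic was used, so a Welch t-statistic stands in for the DE score, and the gene names (`G1` to `G3`), expression values, and function names are all hypothetical.

```python
from math import sqrt
from statistics import mean, variance


def welch_t(case, control):
    """Welch t-statistic; positive means higher expression in cases.

    Assumes at least two samples per group and non-zero variance.
    """
    se = sqrt(variance(case) / len(case) + variance(control) / len(control))
    return (mean(case) - mean(control)) / se


def top_de_genes(dataset, k):
    """Return {gene: t} for the k genes with the largest |t|."""
    scores = {g: welch_t(case, ctrl) for g, (case, ctrl) in dataset.items()}
    ranked = sorted(scores, key=lambda g: abs(scores[g]), reverse=True)
    return {g: scores[g] for g in ranked[:k]}


def direction_consistency(dataset_a, dataset_b, k):
    """Fraction of dataset_a's top-k DE genes with the same sign in dataset_b."""
    top_a = top_de_genes(dataset_a, k)
    shared = [g for g in top_a if g in dataset_b]
    if not shared:
        return 0.0
    same = sum(1 for g in shared if top_a[g] * welch_t(*dataset_b[g]) > 0)
    return same / len(shared)


# Hypothetical toy data: gene -> (case expression values, control values).
study_a = {
    "G1": ([8.1, 8.4, 8.3], [5.0, 5.2, 5.1]),  # up-regulated in cases
    "G2": ([2.0, 2.2, 2.1], [6.0, 6.3, 6.1]),  # down-regulated in cases
    "G3": ([4.0, 4.5, 3.9], [4.1, 4.2, 4.4]),  # little change
}
study_b = {
    "G1": ([7.9, 8.0, 8.2], [5.5, 5.3, 5.4]),
    "G2": ([2.5, 2.4, 2.6], [5.9, 6.0, 6.2]),
    "G3": ([4.3, 4.0, 4.1], [4.2, 4.4, 3.8]),
}

consistency = direction_consistency(study_a, study_b, k=2)
print(f"Direction consistency of top-2 DE genes: {consistency:.2f}")
```

The sketch covers only the direction-consistency part of the analysis. Checking coexpression and connectivity in a PPI subnetwork would additionally require an interaction network, which is not modelled here.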