Xu Jian-Zhen, Guo Zheng, Zhang Min, Li Xia, Li Yong-Jin, Rao Shao-Qi
Department of Bioinformatics, Harbin Medical University, Harbin, China.
Mol Med. 2006 Jan-Mar;12(1-3):25-33. doi: 10.2119/2005-00036.Xu.
Discovering molecular heterogeneities in phenotypically defined diseases is critically important both for understanding the pathogenic mechanisms of complex diseases and for finding effective treatments. It has recently been recognized that cellular phenotypes are determined by the concerted actions of many functionally related genes acting in a modular fashion. Uncovering these modular mechanisms should help reveal the hidden genetic heterogeneities of complex diseases. We defined a putative disease module as a functional gene group, characterized by both biological process and cellular localization, that is significantly enriched with genes whose expression is highly variable across the disease samples. As validation, we used two large cancer datasets to evaluate how well the modules partition samples correctly. We then searched for subtypes of diffuse large B-cell lymphoma (DLBCL), a complex disease, using a public dataset. Finally, we verified the clinical significance of the identified subtypes by survival analysis. In the two validation datasets, we achieved highly accurate partitions that best fit the clinical cancer phenotypes. For the notoriously heterogeneous DLBCL, the two subtypes partitioned using an identified module ("cellular response to stress") had very different 5-year overall survival rates (65% vs. 14%), and the partition was highly significantly correlated with clinical survival (P < 0.007). We also built a multivariate Cox proportional-hazards prediction model that included 4 genes as risk predictors of survival in DLBCL. The proposed modular approach is a promising computational strategy for peeling off genetic heterogeneities and understanding the modular mechanisms of human diseases such as cancers.
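The module definition above (a functional gene group significantly enriched with highly variable genes) can be illustrated with a minimal sketch. The abstract does not say which enrichment statistic was used, so the hypergeometric upper-tail test below is an assumption, as are the function names, the toy gene identifiers, and the 0.05 threshold. Multiple-testing correction, which a real analysis over many functional groups would need, is omitted.

```python
from math import comb


def hypergeom_upper_tail(k, n_module, n_variable, n_total):
    """P(X >= k): chance that at least k of the n_variable highly variable
    genes fall in a module of n_module genes, out of n_total background genes."""
    upper = min(n_module, n_variable)
    denom = comb(n_total, n_variable)
    numer = sum(
        comb(n_module, i) * comb(n_total - n_module, n_variable - i)
        for i in range(k, upper + 1)
    )
    return numer / denom


def enriched_modules(modules, variable_genes, background, alpha=0.05):
    """Return (module name, p-value) pairs with p < alpha, smallest p first.

    modules: dict mapping a (hypothetical) module name to a set of gene IDs.
    variable_genes: set of genes judged highly variable across samples.
    background: set of all genes measured.
    """
    variable = variable_genes & background
    hits = []
    for name, genes in modules.items():
        members = genes & background
        overlap = len(members & variable)
        p = hypergeom_upper_tail(overlap, len(members), len(variable), len(background))
        if p < alpha:
            hits.append((name, p))
    return sorted(hits, key=lambda pair: pair[1])


# Toy data only: 20 background genes, 5 of them highly variable.
background = {f"g{i}" for i in range(20)}
variable_genes = {"g0", "g1", "g2", "g3", "g4"}
modules = {
    "A": {"g0", "g1", "g2", "g3"},     # every member is highly variable
    "B": {"g4", "g10", "g11", "g12"},  # only one member is highly variable
}
candidates = enriched_modules(modules, variable_genes, background)
```

In this toy setup only module "A" passes the threshold. In the workflow the abstract describes, the expression profiles of the genes in such a module would then be used to partition the samples, a step this sketch does not cover.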