Teschendorff Andrew E, Severini Simone
Medical Genomics Group, Paul O'Gorman Building, UCL Cancer Institute, University College London, 72 Huntley Street, London WC1E 6BT, UK.
BMC Syst Biol. 2010 Jul 30;4:104. doi: 10.1186/1752-0509-4-104.
The statistical study of biological networks has led to important novel biological insights, such as the presence of hubs and hierarchical modularity. There is also growing interest in studying the statistical properties of networks in the context of cancer genomics. However, relatively little is known about which network features differ between cancer and normal cell physiologies, or between different cancer cell phenotypes.
Based on the observation that frequent genomic alterations underlie a more aggressive cancer phenotype, we asked whether such an effect could be detected as an increase in the randomness of local gene expression patterns. Using a breast cancer gene expression data set and a model network of protein interactions, we derive constrained weighted networks defined by a stochastic information flux matrix reflecting expression correlations between interacting proteins. Based on this stochastic matrix, we propose and compute an entropy measure that quantifies the degree of randomness in the local pattern of information flux around single genes. By comparing the local entropies in the non-metastatic versus metastatic breast cancer networks, we show that breast cancers that metastasize are characterised by a small yet significant increase in the degree of randomness of local expression patterns. We validate this result in three additional breast cancer expression data sets and demonstrate that local entropy characterises the metastatic phenotype better than other, non-entropy-based measures. We show that increases in entropy can be used to identify genes and signalling pathways implicated in breast cancer metastasis, and provide examples of de novo discoveries of gene modules with known roles in apoptosis, immune-mediated tumour suppression, cell cycle and tumour invasion. Importantly, we also identify a novel gene module within the insulin growth factor signalling pathway, alteration of which may predispose the tumour to metastasize.
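To make the construction concrete, the following is a minimal sketch of how a row-stochastic flux matrix and a per-gene local entropy could be computed from an expression matrix and a protein interaction adjacency matrix. Several choices here are assumptions for illustration and are not stated in the abstract: the mapping of a Pearson correlation c to an edge weight (1 + c) / 2, the normalisation of each entropy by the logarithm of the gene's degree, the function name `local_entropies`, and the toy data.

```python
import numpy as np


def local_entropies(expr, adj):
    """Sketch of a local entropy per gene (illustrative, not the authors' code).

    expr: (genes x samples) expression array.
    adj:  symmetric 0/1 (genes x genes) interaction matrix with a zero diagonal.
    Returns the row-stochastic matrix p and the normalised entropies s.
    """
    corr = np.corrcoef(expr)  # Pearson correlation between gene rows
    # Assumed weight scheme: map correlations from [-1, 1] to [0, 1],
    # and keep weights only on edges of the interaction network.
    weights = adj * (1.0 + corr) / 2.0
    row_sums = weights.sum(axis=1, keepdims=True)
    # Row-normalise so that each gene's outgoing weights sum to 1.
    p = np.divide(weights, row_sums, out=np.zeros_like(weights), where=row_sums > 0)
    # Shannon entropy of each row, treating 0 * log(0) as 0.
    log_p = np.log(p, out=np.zeros_like(p), where=p > 0)
    h = -(p * log_p).sum(axis=1)
    # Assumed normalisation: divide by log(degree) so values lie in [0, 1].
    # Genes with fewer than two neighbours are left as NaN.
    degree = adj.sum(axis=1)
    s = np.full(len(h), np.nan)
    ok = degree >= 2
    s[ok] = h[ok] / np.log(degree[ok])
    return p, s


# Toy data: 4 genes measured in 5 samples (made-up values).
expr = np.array([
    [1.0, 2.0, 3.0, 4.0, 5.0],   # gene 0
    [2.0, 4.0, 6.0, 8.0, 10.0],  # gene 1: perfectly correlated with gene 0
    [5.0, 4.0, 3.0, 2.0, 1.0],   # gene 2: perfectly anti-correlated with gene 0
    [1.0, 3.0, 2.0, 5.0, 4.0],   # gene 3: partially correlated with gene 0
])
# Toy interaction network: edges 0-1, 0-2, 3-1, 3-2.
adj = np.array([
    [0, 1, 1, 0],
    [1, 0, 0, 1],
    [1, 0, 0, 1],
    [0, 1, 1, 0],
])

p, s = local_entropies(expr, adj)
print(np.round(p, 3))
print(np.round(s, 3))
```

In this sketch a gene whose weight is concentrated on a single neighbour has entropy near 0, while a gene whose weight is spread evenly over its neighbours has entropy near 1. A comparison between phenotypes would then amount to running the function once on each group's expression matrix over the same network and examining the per-gene differences in `s`.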
These results demonstrate that a metastatic cancer phenotype is characterised by an increase in the randomness of the local information flux patterns. Measures of local randomness in integrated protein interaction and mRNA expression networks may therefore be useful for identifying genes and signalling pathways disrupted in one phenotype relative to another. Further exploration of the statistical properties of such integrated cancer expression and protein interaction networks will be a fruitful endeavour.