Park Yongjin, Shackney Stanley, Schwartz Russell
Department of Biological Sciences, Carnegie Mellon University, Pittsburgh, PA 15213, USA.
IEEE/ACM Trans Comput Biol Bioinform. 2009 Apr-Jun;6(2):200-12. doi: 10.1109/TCBB.2008.126.
Cancer cells exhibit a common phenotype of uncontrolled cell growth, but this phenotype may arise from many different combinations of mutations. By inferring how cells evolve in individual tumors, a process called cancer progression, we may be able to identify important mutational events for different tumor types, potentially leading to new therapeutics and diagnostics. Prior work has shown that it is possible to infer frequent progression pathways by using gene expression profiles to estimate "distances" between tumors. Here, we apply gene network models to improve these estimates of evolutionary distance by controlling for correlations among coregulated genes. We test three variants of this approach: one using an optimized best-fit network, another using sampling to infer a high-confidence subnetwork, and one using a modular network inferred from clusters of similarly expressed genes. Application to lung cancer and breast cancer microarray data sets shows small improvements in phylogenies when correcting from the optimized network and more substantial improvements when correcting from the sampled or modular networks. Our results suggest that a network correction approach improves estimates of tumor similarity, but sophisticated network models are needed to control for the large hypothesis space and sparse data currently available.
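The idea of controlling for correlations among coregulated genes can be pictured with a toy example. The sketch below is not the authors' method, which relies on inferred gene network models; it is a simplified stand-in, loosely in the spirit of the modular variant, that averages each module of coregulated genes into one feature before measuring the distance between tumors. The function names, the module assignments, and the expression values are all hypothetical.

```python
import math


def euclidean(a, b):
    """Plain Euclidean distance between two expression profiles."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def collapse_modules(profile, modules):
    """Replace each module of coregulated genes by its mean expression.

    `modules` is a list of lists of gene indices (an assumed, hand-made
    grouping here; in practice it would have to be inferred from data).
    """
    return [sum(profile[i] for i in module) / len(module) for module in modules]


def module_corrected_distance(a, b, modules):
    """Distance computed on module averages rather than on individual genes."""
    return euclidean(collapse_modules(a, modules), collapse_modules(b, modules))


# Hypothetical toy data: five genes, where genes 0-3 are coregulated
# and gene 4 is regulated independently.
modules = [[0, 1, 2, 3], [4]]
tumor_a = [1.0, 1.0, 1.0, 1.0, 0.0]
tumor_b = [2.0, 2.0, 2.0, 2.0, 0.0]  # differs from A in the whole module
tumor_c = [1.0, 1.0, 1.0, 1.0, 1.0]  # differs from A in the single gene

raw_ab = euclidean(tumor_a, tumor_b)
raw_ac = euclidean(tumor_a, tumor_c)
corrected_ab = module_corrected_distance(tumor_a, tumor_b, modules)
corrected_ac = module_corrected_distance(tumor_a, tumor_c, modules)

print(raw_ab, raw_ac)              # raw distance counts the module four times
print(corrected_ab, corrected_ac)  # corrected distance counts it once
```

In this toy case the uncorrected distance treats the shift in one coregulated module as four independent changes, so tumor B looks twice as far from A as tumor C does. After collapsing the module, both differences count as a single regulatory change and the two distances are equal.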