Subramanian Ayshwarya, Shackney Stanley, Schwartz Russell
Department of Biological Sciences, Carnegie Mellon University, Pittsburgh, PA 15213, USA.
J Biomed Biotechnol. 2012;2012:797812. doi: 10.1155/2012/797812. Epub 2012 May 13.
Tumorigenesis can in principle result from many combinations of mutations, but only a few roughly equivalent sequences of mutations, or "progression pathways," seem to account for most human tumors. Phylogenetics provides a promising way to identify common progression pathways and markers of those pathways. This approach, however, can be confounded by the high heterogeneity within and between tumors, which makes it difficult to identify conserved progression stages or organize them into robust progression pathways. To tackle this problem, we previously developed methods for inferring progression stages from heterogeneous tumor profiles through computational unmixing. In this paper, we develop a novel pipeline for building trees of tumor evolution from the unmixed tumor data. The pipeline implements a statistical approach for identifying robust progression markers from unmixed tumor data and calling those markers in inferred cell states. The result is a set of phylogenetic characters and their assignments in progression states to which we apply maximum parsimony phylogenetic inference to infer tumor progression pathways. We demonstrate the full pipeline on simulated and real comparative genomic hybridization (CGH) data, validating its effectiveness and making novel predictions of major progression pathways and ancestral cell states in breast cancers.
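The maximum parsimony step described above can be illustrated with a minimal sketch. It is not the authors' pipeline: it assumes the progression markers have already been discretized into per-state calls (here -1 for loss, 0 for normal, +1 for gain) and scores fixed candidate rooted binary trees using Fitch's small-parsimony algorithm. The state names, marker names, and candidate trees are hypothetical placeholders.

```python
from typing import Dict, Tuple, Union

# A tree is either a leaf name or a pair of subtrees (rooted, binary).
Tree = Union[str, Tuple["Tree", "Tree"]]


def fitch_score(tree: Tree, states: Dict[str, int]) -> int:
    """Minimum number of state changes for one character on a fixed tree."""
    def visit(node):
        if isinstance(node, str):
            # Leaf: the candidate set is just the observed call.
            return {states[node]}, 0
        left, right = node
        lset, lcost = visit(left)
        rset, rcost = visit(right)
        common = lset & rset
        if common:
            # Children agree on at least one state: no extra change needed.
            return common, lcost + rcost
        # Children disagree: take the union and count one change.
        return lset | rset, lcost + rcost + 1

    return visit(tree)[1]


def parsimony_score(tree: Tree, characters: Dict[str, Dict[str, int]]) -> int:
    """Sum of Fitch scores over all characters (markers)."""
    return sum(fitch_score(tree, calls) for calls in characters.values())


# Hypothetical marker calls in three inferred cell states plus a normal state.
characters = {
    "marker_a": {"normal": 0, "s1": 1, "s2": 1, "s3": 0},
    "marker_b": {"normal": 0, "s1": 0, "s2": -1, "s3": -1},
    "marker_c": {"normal": 0, "s1": 1, "s2": 1, "s3": 1},
}

# Hypothetical candidate topologies to compare.
candidates = {
    "tree_1": (("s1", "s2"), ("s3", "normal")),
    "tree_2": (("s2", "s3"), ("s1", "normal")),
    "tree_3": (("s1", "s3"), ("s2", "normal")),
}

scores = {name: parsimony_score(t, characters) for name, t in candidates.items()}
best = min(scores.values())
most_parsimonious = sorted(n for n, s in scores.items() if s == best)
print(scores, most_parsimonious)
```

A full maximum parsimony inference would also search over tree topologies rather than score a hand-picked list; the sketch only shows how a set of character assignments is turned into a score that such a search would minimize.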