Carnegie Mellon University, Pittsburgh.
Oncotherapeutics, Pittsburgh.
IEEE/ACM Trans Comput Biol Bioinform. 2013 Nov-Dec;10(6):1422-31. doi: 10.1109/TCBB.2013.33.
Computational cancer phylogenetics seeks to enumerate the temporal sequences of aberrations in tumor evolution, thereby delineating possible tumor progression pathways, molecular subtypes, and mechanisms of action. We previously developed a pipeline for constructing phylogenies that describe evolution between major recurring cell types computationally inferred from whole-genome tumor profiles. The accuracy and detail of these phylogenies, however, depend on identifying accurate, high-resolution molecular markers of progression, i.e., reproducible regions of aberration that robustly differentiate subtypes and stages of progression. Here, we present a novel hidden Markov model (HMM) scheme for inferring such phylogenetically significant markers through joint segmentation and calling of multisample tumor data. Our method takes sets of genome-wide DNA copy number measurements and, at each probe, partitions the samples into normal (diploid) or amplified. It differs from similar HMM methods in being designed specifically for the needs of tumor phylogenetics: it seeks robust markers of progression conserved across a set of copy number profiles. We compare our method with other methods on both synthetic and real tumor data; the analysis confirms its effectiveness for tumor phylogeny inference and suggests avenues for future advances.
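To make the idea of HMM-based segmentation and calling concrete, the following is a minimal illustrative sketch, not the model described in this paper. It runs a two-state Viterbi decoder over the probes of a single sample and labels each probe as normal or amplified. Everything specific here is an assumption made for illustration: the function name `viterbi_calls`, the Gaussian emission model, the state means (0.0 for diploid, 0.58 for roughly a single-copy gain in log2 ratio), the shared standard deviation, and the self-transition probability. Unlike the joint multisample scheme of the abstract, this sketch treats each sample independently.

```python
import math


def gaussian_logpdf(x, mean, sd):
    """Log density of a normal distribution at x."""
    return -0.5 * math.log(2 * math.pi * sd * sd) - (x - mean) ** 2 / (2 * sd * sd)


def viterbi_calls(log_ratios, means=(0.0, 0.58), sd=0.25, stay=0.95):
    """Return one call per probe: 0 = normal (diploid), 1 = amplified.

    All parameter defaults are illustrative assumptions, not values
    taken from the paper.
    """
    if not log_ratios:
        return []
    n_states = 2
    log_stay, log_switch = math.log(stay), math.log(1.0 - stay)

    # Uniform prior over the two states at the first probe.
    score = [math.log(0.5) + gaussian_logpdf(log_ratios[0], means[s], sd)
             for s in range(n_states)]
    back = []  # back[t][s] = best predecessor of state s at probe t + 1

    for x in log_ratios[1:]:
        new_score, pointers = [], []
        for s in range(n_states):
            candidates = [score[p] + (log_stay if p == s else log_switch)
                          for p in range(n_states)]
            best = max(range(n_states), key=lambda p: candidates[p])
            pointers.append(best)
            new_score.append(candidates[best] + gaussian_logpdf(x, means[s], sd))
        score = new_score
        back.append(pointers)

    # Trace back the most probable state path from the last probe.
    state = max(range(n_states), key=lambda s: score[s])
    path = [state]
    for pointers in reversed(back):
        state = pointers[state]
        path.append(state)
    path.reverse()
    return path


# Hypothetical log2-ratio profile with one amplified segment.
profile = [0.02, -0.05, 0.1, 0.6, 0.55, 0.7, 0.62, 0.01, -0.03]
print(viterbi_calls(profile))  # → [0, 0, 0, 1, 1, 1, 1, 0, 0]
```

The high self-transition probability is what turns per-probe classification into segmentation: an isolated noisy probe is not enough to pay for two state switches, so calls come out as contiguous regions. A joint multisample model would additionally couple the calls across samples so that segment boundaries are shared, which this sketch does not attempt.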