Weber Leah L, El-Kebir Mohammed
Department of Computer Science, University of Illinois at Urbana-Champaign, Urbana, IL, 61801, USA.
Algorithms Mol Biol. 2021 Jul 6;16(1):14. doi: 10.1186/s13015-021-00194-5.
Cancer arises from an evolutionary process in which somatic mutations give rise to clonal expansions. Reconstructing this evolutionary process is useful for treatment decision-making as well as for understanding evolutionary patterns across patients and cancer types. In particular, classifying a tumor's evolutionary process as either linear or branched, and understanding which cancer types and which patients follow each of these trajectories, could provide useful insights for both clinicians and researchers. While comprehensive cancer phylogeny inference from single-cell DNA sequencing data is challenging due to the limitations of current sequencing technology and the complexity of the resulting problem, current data might provide sufficient signal to accurately classify a tumor's evolutionary history as either linear or branched.
We introduce the Linear Perfect Phylogeny Flipping (LPPF) problem as a means of testing two alternative hypotheses for the pattern of evolution, and we prove that this problem is NP-hard. We develop Phyolin, which uses constraint programming to solve the LPPF problem. In both in silico experiments and an application to real data, Phyolin outperforms a competing machine learning approach.
Phyolin is an accurate, easy-to-use, and fast method for classifying a tumor's evolutionary trajectory as linear or branched from single-cell DNA sequencing data.
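To make the idea behind the LPPF problem concrete, the following is a minimal sketch and not Phyolin's implementation. It assumes a binary matrix with cells as rows and mutations as columns, treats a phylogeny as linear when the mutation columns are totally ordered by containment, and assumes for illustration that only 0-to-1 flips (false negatives) are allowed; the abstract does not state the error model. The function names `is_linear` and `min_flips_to_linear` are hypothetical, and the brute-force search over mutation orders is only feasible for a handful of mutations, whereas Phyolin uses constraint programming.

```python
from itertools import permutations


def is_linear(matrix):
    """Return True if the mutation columns are totally ordered by containment.

    Assumption: rows are cells, columns are mutations, entries are 0 or 1.
    """
    n_mut = len(matrix[0]) if matrix else 0
    # For each mutation, collect the set of cells that carry it.
    cols = [{i for i, row in enumerate(matrix) if row[j]} for j in range(n_mut)]
    # Linear means every pair of mutations is nested (one contains the other).
    return all(a <= b or b <= a for a in cols for b in cols)


def min_flips_to_linear(matrix):
    """Brute-force the fewest 0 -> 1 flips that make the matrix linear.

    Assumption: only false negatives are corrected (0 -> 1 flips).
    Each permutation is a candidate chain of mutations, earliest first;
    in a linear phylogeny every cell carries a prefix of that chain.
    """
    n_mut = len(matrix[0]) if matrix else 0
    best = None
    for order in permutations(range(n_mut)):
        flips = 0
        for row in matrix:
            ordered = [row[j] for j in order]
            if 1 in ordered:
                # Index of the deepest mutation observed in this cell.
                last = len(ordered) - 1 - ordered[::-1].index(1)
                # Every missing mutation before it must be flipped to 1.
                flips += ordered[:last].count(0)
        if best is None or flips < best:
            best = flips
    return best


# Toy inputs (made up for illustration, not data from the paper).
linear_example = [[1, 0, 0], [1, 1, 0], [1, 1, 1]]
branched_example = [[1, 1, 0], [1, 0, 1], [1, 0, 0]]
```

Under these assumptions, a large number of required flips relative to the number of zeros would suggest that a linear history explains the data poorly, which is the intuition behind testing linear against branched evolution.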