Liu Yushu, Edrisi Mohammadamin, Yan Zhi, Ogilvie Huw A, Nakhleh Luay
Department of Computer Science, Rice University, 6100 Main St, Houston, TX 77005, USA.
Department of Genetics, University of Texas MD Anderson Cancer Center, Houston, TX 77030, USA.
Algorithms Mol Biol. 2024 Apr 29;19(1):18. doi: 10.1186/s13015-024-00264-4.
Copy number aberrations (CNAs) are ubiquitous in many types of cancer. Inferring CNAs from cancer genomic data could help shed light on the initiation, progression, and potential treatment of cancer. While such data have traditionally been available via bulk sequencing, more recently introduced techniques for single-cell DNA sequencing (scDNAseq) provide data that make CNA inference possible at single-cell resolution. We introduce a new birth-death evolutionary model of CNAs and a Bayesian method, NestedBD, for inferring evolutionary trees (topologies and branch lengths with relative mutation rates) from single-cell data. We evaluated NestedBD's performance on simulated data sets, benchmarking its accuracy against traditional phylogenetic tools as well as state-of-the-art methods. The results show that NestedBD infers more accurate topologies and branch lengths, and that the birth-death model can improve the accuracy of copy number estimation. When applied to biological data sets, NestedBD infers plausible evolutionary histories of two colorectal cancer samples. NestedBD is available at https://github.com/Androstane/NestedBD.
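To illustrate what a birth-death model of copy number can look like, the following is a minimal sketch of a generic linear birth-death process for a single genomic bin evolving along one branch. It is not NestedBD's implementation, and the model in the paper may differ in its details. The function name, the parameters, and the assumption that each copy is gained or lost independently at constant rates are all hypothetical choices made for illustration.

```python
import random


def simulate_copy_number(start, birth_rate, death_rate, branch_length, rng):
    """Simulate one bin's copy number along a branch (illustrative only).

    Assumes a linear birth-death process: each existing copy is duplicated
    at rate `birth_rate` and lost at rate `death_rate`. A copy number of 0
    is absorbing, because a lost segment cannot be regained.
    """
    copies, elapsed = start, 0.0
    per_copy_rate = birth_rate + death_rate
    while copies > 0 and per_copy_rate > 0:
        # Waiting time to the next event is exponential in the total rate.
        elapsed += rng.expovariate(copies * per_copy_rate)
        if elapsed > branch_length:
            break  # the next event would fall past the end of the branch
        # Choose a gain or a loss in proportion to the two rates.
        if rng.random() < birth_rate / per_copy_rate:
            copies += 1
        else:
            copies -= 1
    return copies


if __name__ == "__main__":
    rng = random.Random(0)
    # Start from the diploid state and evolve along a branch of length 1.0.
    samples = [simulate_copy_number(2, 0.3, 0.3, 1.0, rng) for _ in range(10)]
    print(samples)
```

Repeating the simulation many times gives an empirical distribution of copy numbers at the end of a branch, which is the kind of transition behaviour a likelihood-based tree inference method needs to evaluate.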