Weiner Samson, Bansal Mukul S
School of Computing, University of Connecticut, Storrs, CT, USA.
Life Sci Alliance. 2024 Dec 12;8(3). doi: 10.26508/lsa.202402923. Print 2025 Mar.
Somatic copy number alterations (sCNAs) are valuable phylogenetic markers for inferring evolutionary relationships among tumor cell subpopulations. Advances in single-cell DNA sequencing technologies are making it possible to obtain such sCNA datasets at ever-larger scales. However, existing methods for reconstructing phylogenies from sCNAs are often too slow for large datasets. We propose two new distance-based methods, DICE-bar and DICE-star, for reconstructing single-cell tumor phylogenies from sCNA data. Using carefully simulated datasets, we find that DICE-bar matches or exceeds the accuracy of all other methods on noise-free datasets, and that DICE-star is exceptionally robust to noise and outperforms all other methods on noisy datasets. Both methods are also orders of magnitude faster than many existing methods. Our experimental analysis also reveals that noise and error in copy number inference, as expected for real datasets, can drastically reduce the accuracy of most methods. We apply DICE-star, the most accurate method on error-prone datasets, to several real single-cell breast and ovarian cancer datasets and find that it rapidly produces phylogenies of equivalent or greater reliability compared with existing methods.