Cancer Data Science Laboratory, Center for Cancer Research, National Cancer Institute, National Institutes of Health, Bethesda, Maryland, USA.
Department of Computer Science, Indiana University, Bloomington, Indiana, USA.
J Comput Biol. 2021 Sep;28(9):857-879. doi: 10.1089/cmb.2020.0595. Epub 2021 Jul 22.
Single-cell sequencing (SCS) data have great potential for reconstructing the evolutionary history of tumors. Rapid advances in SCS technology over the past decade were followed by the design of various computational methods for inferring trees of tumor evolution. Some of the earliest methods searched the space of trees directly, with the goal of finding the maximum likelihood tree. However, it can be shown that, instead of searching the tree space directly, we can search the space of binary matrices and obtain the maximum likelihood tree directly from the maximum likelihood matrix. The potential of this latter search strategy has recently been recognized by different research groups, and several related methods have been published in the past two years. Here we review the theoretical background of these methods and discuss in detail the relationship between the two tree search strategies, a discussion that is largely missing from the available publications. We also discuss each of the existing methods based on the search in the space of binary matrices and summarize the best-known single-cell DNA sequencing data sets, which can be used in the future to assess the performance of newly developed methods on real data.
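To illustrate how a tree can be read off a binary matrix, the following minimal Python sketch assumes a cells-by-mutations matrix under the infinite sites assumption, where the matrices of interest are those that admit a perfect phylogeny (no pair of columns shows all three row patterns (1,1), (1,0), and (0,1)). The function names `is_conflict_free` and `mutation_tree` are hypothetical and are not taken from any of the reviewed methods. The sketch covers only the matrix-to-tree step, not the likelihood search itself.

```python
from itertools import combinations


def is_conflict_free(matrix):
    """Return True if no pair of columns contains rows (1,1), (1,0) and (0,1)."""
    n_cols = len(matrix[0]) if matrix else 0
    for a, b in combinations(range(n_cols), 2):
        patterns = {(row[a], row[b]) for row in matrix}
        if {(1, 1), (1, 0), (0, 1)} <= patterns:
            return False
    return True


def mutation_tree(matrix):
    """Map each mutation (column index) to its parent mutation or to "root".

    Assumes the matrix is conflict-free, so the cell sets of any two
    mutations are either nested or disjoint.
    """
    n_cols = len(matrix[0])
    cells = {
        j: frozenset(i for i, row in enumerate(matrix) if row[j])
        for j in range(n_cols)
    }
    parent = {}
    for j in range(n_cols):
        if not cells[j]:
            continue  # mutation observed in no cell: left out of the tree
        # Ancestors carry a strict superset of cells; mutations with an
        # identical cell set are ordered by column index as a tie-break.
        ancestors = [
            k for k in range(n_cols)
            if k != j and (cells[k] > cells[j]
                           or (cells[k] == cells[j] and k < j))
        ]
        # The parent is the closest ancestor: smallest cell set, and among
        # identical sets the largest column index.
        parent[j] = (
            min(ancestors, key=lambda k: (len(cells[k]), -k))
            if ancestors else "root"
        )
    return parent


# Rows are cells, columns are mutations (illustrative toy data).
toy = [
    [1, 0, 0],
    [1, 1, 0],
    [1, 1, 0],
    [0, 0, 1],
]
if is_conflict_free(toy):
    print(mutation_tree(toy))
```

In this toy matrix, mutation 1 occurs only in cells that also carry mutation 0, so it is placed below mutation 0, while mutation 2 forms a separate branch at the root.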