IEEE/ACM Trans Comput Biol Bioinform. 2018 Jan-Feb;15(1):96-108. doi: 10.1109/TCBB.2016.2606620. Epub 2016 Oct 13.
Hajirasouliha and Raphael (WABI 2014) proposed a model for deconvoluting mixed tumor samples measured from a collection of high-throughput sequencing reads. This is related to understanding tumor evolution and critical cancer mutations. In short, their formulation asks to split each row of a binary matrix so that the resulting matrix corresponds to a perfect phylogeny and has the minimum number of rows among all matrices with this property. In this paper, we disprove several claims about this problem, including an NP-hardness proof of it. However, we show that the problem is indeed NP-hard, by providing a different proof. We also prove NP-completeness of a variant of this problem proposed in the same paper. On the positive side, we propose an efficient (though not necessarily optimal) heuristic algorithm based on coloring co-comparability graphs, and a polynomial time algorithm for solving the problem optimally on matrix instances in which no column is contained in both columns of a pair of conflicting columns. Implementations of these algorithms are freely available at https://github.com/alexandrutomescu/MixedPerfectPhylogeny.
哈吉拉苏利哈和拉斐尔(WABI 2014)提出了一种用于对高通量测序reads 集合中测量的混合肿瘤样本进行反卷积的模型。这与理解肿瘤进化和关键癌症突变有关。简而言之,他们的公式要求将二进制矩阵的每一行分割,使得生成的矩阵对应于完美的系统发育,并且在所有具有此属性的矩阵中具有最小行数。在本文中,我们反驳了有关该问题的几个主张,包括对其进行 NP 难证明。但是,我们通过提供不同的证明来证明该问题确实是 NP 难的。我们还证明了同一篇论文中提出的该问题的变体是 NP 完全的。从积极的方面来看,我们基于共可比较图的着色提出了一种有效的(尽管不一定是最优的)启发式算法,并提出了一种针对没有列包含在一对冲突列的两列中的矩阵实例的最优问题的多项式时间算法。这些算法的实现可在 https://github.com/alexandrutomescu/MixedPerfectPhylogeny 上免费获得。