Dondi Riccardo, Lafond Manuel, El-Mabrouk Nadia
Dipartimento di Lettere, Filosofia, Comunicazione, Università degli Studi di Bergamo, Via Donizetti 3, 24129 Bergamo, Italy.
Department of Mathematics and Statistics, University of Ottawa, Ottawa, Canada.
Algorithms Mol Biol. 2017 Mar 11;12:4. doi: 10.1186/s13015-017-0096-x. eCollection 2017.
Given a gene family, the relations between genes (orthology/paralogy), are represented by a relation graph, where edges connect pairs of orthologous genes and "missing" edges represent paralogs. While a gene tree directly induces a relation graph, the converse is not always true. Indeed, a relation graph is not necessarily "satisfiable", i.e. does not necessarily correspond to a gene tree. And even if that holds, it may not be "consistent", i.e. the tree may not represent a true history in agreement with a species tree. Previous studies have addressed the problem of correcting a relation graph for satisfiability and consistency. Here we consider the weighted version of the problem, where a degree of confidence is assigned to each orthology or paralogy relation. We also consider a maximization variant of the unweighted version of the problem.
We provide complexity and algorithmic results for the approximation of the considered problems. We show that minimizing the correction of a weighted graph does not admit a constant factor approximation algorithm assuming the unique game conjecture, and we give an -approximation algorithm, being the number of vertices in the graph. We also provide polynomial time approximation schemes for the maximization variant for unweighted graphs.
We provided complexity and algorithmic results for variants of the problem of correcting a relation graph for satisfiability and consistency. For the maximization variants we were able to design polynomial time approximation schemes, while for the weighted minimization variants we were able to provide the first inapproximability results.
对于一个基因家族,基因之间的关系(直系同源/旁系同源)由一个关系图表示,其中边连接直系同源基因对,而“缺失”的边表示旁系同源基因。虽然基因树直接诱导出一个关系图,但反之则不一定成立。实际上,一个关系图不一定是“可满足的”,即不一定对应于一棵基因树。而且即使成立,它也可能不是“一致的”,即该树可能不代表与物种树一致的真实历史。先前的研究已经解决了校正关系图以使其可满足和一致的问题。在这里,我们考虑该问题的加权版本,其中为每个直系同源或旁系同源关系分配一个置信度。我们还考虑了该问题无加权版本的最大化变体。
我们给出了所考虑问题近似解的复杂度和算法结果。我们表明,假设唯一博弈猜想成立,最小化加权图的校正不允许有常数因子近似算法,并且我们给出了一个近似算法,其中是图中的顶点数。我们还为无加权图的最大化变体提供了多项式时间近似方案。
我们给出了校正关系图以使其可满足和一致问题变体的复杂度和算法结果。对于最大化变体,我们能够设计多项式时间近似方案,而对于加权最小化变体,我们能够给出首个不可近似性结果。