Division of Mathematical Sciences, School of Physical and Mathematical Sciences, Nanyang Technological University, Singapore.
BMC Bioinformatics. 2011 Oct 5;12 Suppl 9(Suppl 9):S17. doi: 10.1186/1471-2105-12-S9-S17.
In this paper we study the problem of sorting unsigned genomes by double-cut-and-join operations, where genomes allow a mix of linear and circular chromosomes to be present. First, we formulate an equivalent optimization problem, called maximum cycle/path decomposition, which is aimed at finding a largest collection of edge-disjoint cycles/AA-paths/AB-paths in a breakpoint graph. Then, we show that the problem of finding a largest collection of edge-disjoint cycles/AA-paths/AB-paths of length no more than l can be reduced to the well-known degree-bounded k-set packing problem with k = 2l. Finally, a polynomial-time approximation algorithm for the problem of sorting unsigned genomes by double-cut-and-join operations is devised, which achieves the approximation ratio 13/9 + ε ≈ 1.4444 + ε, for any positive ε. For the restricted variation where each genome contains only one linear chromosome, the approximation ratio can be further improved to 69/49 + ε ≈ 1.4082 + ε.
在本文中,我们研究了通过双切割和连接操作对无符号基因组进行排序的问题,其中基因组允许存在混合的线性和圆形染色体。首先,我们提出了一个等价的优化问题,称为最大循环/路径分解,旨在找到断点图中最大的边不相交循环/AA-路径/AB-路径集合。然后,我们证明了找到长度不超过 l 的最大边不相交循环/AA-路径/AB-路径集合的问题可以归约为著名的度约束 k-集分组问题,其中 k = 2l。最后,设计了一种用于通过双切割和连接操作对无符号基因组进行排序的多项式时间近似算法,其逼近比为 13/9 + ε ≈ 1.4444 + ε,对于任意正 ε。对于每个基因组仅包含一条线性染色体的受限变体,逼近比可以进一步提高到 69/49 + ε ≈ 1.4082 + ε。