LIFL, UMR 8022 CNRS, Université Lille 1, INRIA Lille Nord Europe, Villeneuve d'Ascq, France.
BMC Bioinformatics. 2011 Oct 5;12 Suppl 9(Suppl 9):S20. doi: 10.1186/1471-2105-12-S9-S20.
Segmental duplications in genomes have been studied for many years. Recently, several studies have highlighted a biological phenomenon called breakpoint-duplication, which apparently associates a significant proportion of segmental duplications in mammals and in the Drosophila species group with breakpoints in rearrangement events.
In this paper, we introduce and study a combinatorial problem, inspired by the breakpoint-duplication phenomenon, called the Genome Dedoubling Problem. It consists of finding a minimum-length rearrangement scenario that transforms a genome with duplicated segments into a non-duplicated genome, such that the duplications are caused by rearrangement breakpoints. We show that the problem, in the Double-Cut-and-Join (DCJ) and the reversal rearrangement models, can be reduced to an APX-complete problem, and we provide algorithms for the Genome Dedoubling Problem with 2-approximable parts. We apply the methods to the reconstruction of a non-duplicated ancestor of Drosophila yakuba.
We present the Genome Dedoubling Problem and describe two algorithms that solve it, one in the DCJ model and one in the reversal model. The usefulness of the problem and of the methods is shown through an application to real Drosophila data.
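As background for the "minimum-length rearrangement scenario" in the DCJ model, the sketch below computes the standard DCJ distance between two fixed genomes. It is not the dedoubling algorithm of this paper, which must additionally choose the non-duplicated target genome. It assumes a simplified setting: both genomes contain the same genes exactly once, and all chromosomes are circular. Under those assumptions the distance is the number of genes minus the number of cycles in the adjacency graph. The function names `adjacencies` and `dcj_distance` are illustrative only.

```python
def adjacencies(genome):
    """Map each gene extremity to its neighbour.

    A genome is a list of circular chromosomes; each chromosome is a
    list of signed integers (the sign gives the gene orientation).
    Extremities are (gene, 't') for the tail and (gene, 'h') for the head.
    """
    partner = {}
    for chrom in genome:
        for i, a in enumerate(chrom):
            b = chrom[(i + 1) % len(chrom)]  # circular successor
            right = (abs(a), 'h' if a > 0 else 't')  # end where we leave a
            left = (abs(b), 't' if b > 0 else 'h')   # end where we enter b
            partner[right] = left
            partner[left] = right
    return partner


def dcj_distance(genome_a, genome_b):
    """DCJ distance for circular genomes on the same gene set (assumption)."""
    adj_a = adjacencies(genome_a)
    adj_b = adjacencies(genome_b)
    if adj_a.keys() != adj_b.keys():
        raise ValueError("genomes must contain the same genes, each once")

    seen = set()
    cycles = 0
    for start in adj_a:
        if start in seen:
            continue
        cycles += 1
        x = start
        # Walk the cycle, alternating adjacencies of A and of B.
        while x not in seen:
            seen.add(x)
            y = adj_a[x]
            seen.add(y)
            x = adj_b[y]

    n_genes = len(adj_a) // 2  # two extremities per gene
    return n_genes - cycles


# One reversal of gene 2 separates these two circular genomes.
print(dcj_distance([[1, -2, 3]], [[1, 2, 3]]))
```

Linear chromosomes and duplicated segments, both of which matter for the Genome Dedoubling Problem itself, are deliberately left out of this sketch.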