BMC Bioinformatics. 2013;14 Suppl 15(Suppl 15):S3. doi: 10.1186/1471-2105-14-S15-S3. Epub 2013 Oct 15.
The inversion distance, that is, the distance between two unichromosomal genomes with the same content when only inversions of DNA segments are allowed, can be computed thanks to a pioneering approach of Hannenhalli and Pevzner from 1995. In 2000, El-Mabrouk extended the inversion model to allow the comparison of unichromosomal genomes with unequal contents, thus considering insertions and deletions of DNA segments in addition to inversions. However, an exact algorithm was presented only for the case with insertions alone and no deletions (or vice versa), while a heuristic was provided for the symmetric case, which allows both insertions and deletions and whose distance is called the inversion-indel distance. In 2005, Yancopoulos, Attie and Friedberg started a new branch of research by introducing the generic double cut and join (DCJ) operation, which can represent several genome rearrangements (including inversions). Among other things, the DCJ model gave rise to two important results. First, it has been shown that the inversion distance can be computed in a simpler way with the help of the DCJ operation. Second, the DCJ operation led to the DCJ-indel distance, which allows the comparison of genomes with unequal contents, considering DCJ operations, insertions and deletions, and can be computed in linear time.
In the present work, we combine these two results to solve an open problem, showing that, when the graph that represents the relation between the two compared genomes has no bad components, the inversion-indel distance is equal to the DCJ-indel distance. We also give lower and upper bounds for the inversion-indel distance in the presence of bad components.
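
As a minimal illustration of the three operations that the inversion-indel model counts, the sketch below applies an inversion, a deletion and an insertion to a genome encoded as a list of signed integers. The encoding and the function names (`inversion`, `deletion`, `insertion`) are assumptions made for this example and are not taken from the paper. The code only shows what each operation does to a gene order; it does not compute any distance.

```python
from typing import List, Sequence

# Assumed encoding: a unichromosomal genome is a list of signed integers,
# where the absolute value names a gene and the sign gives its orientation.
Genome = List[int]


def inversion(genome: Genome, i: int, j: int) -> Genome:
    """Reverse the segment genome[i:j] and flip the orientation of its genes."""
    return genome[:i] + [-g for g in reversed(genome[i:j])] + genome[j:]


def deletion(genome: Genome, i: int, j: int) -> Genome:
    """Remove the contiguous segment genome[i:j]."""
    return genome[:i] + genome[j:]


def insertion(genome: Genome, i: int, segment: Sequence[int]) -> Genome:
    """Insert a contiguous segment before position i."""
    return genome[:i] + list(segment) + genome[i:]


if __name__ == "__main__":
    a = [1, -3, -2, 4, 5]
    b = inversion(a, 1, 3)   # invert the segment (-3, -2)
    c = deletion(b, 4, 5)    # delete gene 5, which is absent from the target
    d = insertion(c, 4, [6]) # insert gene 6, which is absent from the source
    print(a, b, c, d)
```

Here one inversion, one deletion and one insertion transform the first genome into the last; whether three operations is the minimum for a given pair is exactly the kind of question the inversion-indel distance answers.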