Department of Computer Science, City University of Hong Kong, 83 Tat Chee Ave., Hong Kong, People's Republic of China.
University of Hong Kong Shenzhen Research Institute, Shenzhen Hi-Tech Industrial Park, Nanshan District, Shenzhen, People's Republic of China.
BMC Bioinformatics. 2018 Aug 13;19(Suppl 9):291. doi: 10.1186/s12859-018-2268-1.
Genome rearrangements describe changes in the genetic linkage relationships of large chromosomal regions and include reversals, transpositions, block interchanges, deletions, insertions, fissions, fusions and translocations. Many algorithms for calculating rearrangement scenarios between two genomes have been proposed. Very often, the calculated rearrangement scenario is not unique for the same pair of permutations. Hence, deciding which calculated rearrangement scenario is more biologically meaningful becomes an essential task. Several mechanisms for genome rearrangements have been studied to date. One important theory is that genome rearrangements may be mediated by repeats, especially in the case of reversal events: many reversal regions are found to be flanked by a pair of inverted repeats. As a result, whether there are repeats at the breakpoints of the calculated rearrangement events can shed light on whether those events are biologically meaningful. To our knowledge, there is no tool that can automatically identify rearrangement events and check whether repeats exist at the breakpoints of each calculated rearrangement event.
In this paper, we describe a new tool named GRSR, which compares multiple unichromosomal genomes to identify "independent" (obvious) rearrangement events such as reversals, (inverted) block interchanges and (inverted) transpositions, and automatically searches for repeats at the breakpoints of each rearrangement event. We apply our tool to the complete genomes of 28 Mycobacterium tuberculosis strains and 24 Shewanella strains. In both Mycobacterium tuberculosis and Shewanella strains, our tool finds many reversal regions flanked by a pair of inverted repeats. In particular, the GRSR tool also finds an inverted transposition and an inverted block interchange in Shewanella, where the repeats at the ends of the rearrangement regions remain unchanged after the rearrangement event. To our knowledge, this is the first time such a phenomenon has been reported for inverted transposition and inverted block interchange in Shewanella.
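To make the notion of an "independent" reversal concrete, the following is a minimal sketch, assuming genomes are encoded as signed permutations of shared blocks (a common convention in the rearrangement literature). The function name `find_single_reversal` is hypothetical, and this is an illustration of the idea rather than the procedure GRSR actually uses.

```python
def find_single_reversal(source, target):
    """Return (i, j) if reversing source[i..j] (inclusive, with sign flips)
    turns source into target; otherwise return None.

    Both genomes are signed permutations: each integer is a conserved block,
    and the sign is the strand/orientation of that block.
    """
    if len(source) != len(target):
        return None

    # Positions where the two genomes disagree.
    diff = [k for k in range(len(source)) if source[k] != target[k]]
    if not diff:
        return None  # identical genomes, no rearrangement to report

    i, j = diff[0], diff[-1]

    # A reversal reverses the block order and flips every orientation.
    reversed_segment = [-block for block in reversed(source[i:j + 1])]
    if reversed_segment == target[i:j + 1]:
        return (i, j)
    return None


if __name__ == "__main__":
    reference = [1, 2, 3, 4, 5]
    strain = [1, -4, -3, -2, 5]
    print(find_single_reversal(reference, strain))
```

The two indices returned mark the breakpoints of the reversal, which are the positions where a repeat search would then be carried out.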
The calculated results provide many examples supporting the theory that repeats at the breakpoints of a rearrangement event can leave the sequences at the breakpoints unchanged by the event, suggesting that the conservation of ends could be a common phenomenon in many types of genome rearrangement events.
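The conservation of ends can be illustrated on a toy sequence. The sketch below assumes a reversal acts as a reverse complement of the affected region and that the region begins with a repeat R and ends with its reverse complement (a pair of inverted repeats). The helper names and the example sequence are hypothetical and are not taken from the GRSR data.

```python
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Reverse complement of an uppercase DNA string."""
    return seq.translate(_COMPLEMENT)[::-1]


def invert_region(genome, start, end):
    """Apply a reversal to genome[start:end] (half-open interval)."""
    return genome[:start] + reverse_complement(genome[start:end]) + genome[end:]


def has_inverted_repeat_ends(genome, start, end, k):
    """True if the first k bases of the region equal the reverse complement
    of its last k bases, i.e. the region ends in a pair of inverted repeats."""
    left = genome[start:start + k]
    right = genome[end - k:end]
    return left == reverse_complement(right)


if __name__ == "__main__":
    repeat = "ACGGT"
    middle = "TTTAG"
    # Region layout: repeat + middle + reverse_complement(repeat)
    genome = "GG" + repeat + middle + reverse_complement(repeat) + "CC"
    start, end, k = 2, len(genome) - 2, len(repeat)

    inverted = invert_region(genome, start, end)
    print(has_inverted_repeat_ends(genome, start, end, k))
    print(genome)
    print(inverted)
```

Because the reverse complement of R + X + rc(R) is R + rc(X) + rc(R), only the middle of the region changes; the sequences at both breakpoints read the same before and after the reversal.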