Lee Heewook, Doak Thomas G, Popodi Ellen, Foster Patricia L, Tang Haixu
School of Informatics and Computing, Indiana University, Bloomington, IN 47401, USA; Computational Biology Department, School of Computer Science, Carnegie Mellon University, Pittsburgh, PA 15213, USA.
Department of Biology, Indiana University, Bloomington, IN 47401, USA; National Center for Genome Analysis Support, Indiana University, Bloomington, IN 47401, USA.
Nucleic Acids Res. 2016 Sep 6;44(15):7109-19. doi: 10.1093/nar/gkw647. Epub 2016 Jul 18.
A majority of large-scale bacterial genome rearrangements involve mobile genetic elements such as insertion sequence (IS) elements. Here we report novel insertions and excisions of IS elements, and recombination between homologous IS elements, identified in a large collection of Escherichia coli mutation accumulation lines by analysis of whole-genome shotgun sequencing data. Based on 857 identified events (758 IS insertions, 98 recombinations and 1 excision), we estimate that the rate of IS insertion is 3.5 × 10⁻⁴ insertions per genome per generation and the rate of IS homologous recombination is 4.5 × 10⁻⁵ recombinations per genome per generation. These events are mostly contributed by the IS elements IS1, IS2, IS5 and IS186. Spatial analysis of new insertions suggests that transposition is biased toward proximal insertions, and the length spectrum of IS-caused deletions is largely explained by local hopping. For each of the IS elements studied, no region of the circular genome is favored or disfavored for new insertions, but there are notable hotspots for deletions. Some elements prefer non-coding sequence or the beginning and end of coding regions, a preference largely explained by target site motifs. Interestingly, transposition and deletion rates remain constant across the wild-type and 12 mutant E. coli lines, each deficient in a distinct DNA repair pathway. Finally, we characterized the target sites of four IS families, confirming previous results and characterizing a highly specific pattern at IS186 target sites, 5'-GGGG(N6/N7)CCCC-3'. We also detected 48 long deletions not involving IS elements.
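As an illustration of the IS186 target-site pattern reported above, the following minimal sketch scans a DNA string for 5'-GGGG(N6/N7)CCCC-3'. It assumes that N6/N7 means a spacer of six or seven arbitrary bases; the names `IS186_MOTIF` and `find_candidate_sites` are hypothetical and are not taken from the paper, and this is not the authors' analysis pipeline. Because the reverse complement of GGGG(N)CCCC has the same form, scanning one strand is enough to find occurrences on both.

```python
import re

# Assumed reading of the motif: four G's, a spacer of 6 or 7 arbitrary
# bases, then four C's. The lookahead lets overlapping occurrences be found.
IS186_MOTIF = re.compile(r"(?=(GGGG[ACGT]{6,7}CCCC))")


def find_candidate_sites(seq):
    """Return (0-based start, matched text) for each motif occurrence.

    At most one match is reported per start position; when both spacer
    lengths fit, the regex engine returns the 7-base spacer first.
    """
    seq = seq.upper()
    return [(m.start(), m.group(1)) for m in IS186_MOTIF.finditer(seq)]


if __name__ == "__main__":
    # Toy sequence built for the example, not real genome data.
    toy = "AA" + "GGGG" + "ACGTAC" + "CCCC" + "TT"
    for start, site in find_candidate_sites(toy):
        print(start, site)
```

A match only marks a sequence that fits the pattern; whether an insertion actually occurs there is a separate question that the sketch does not address.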