Daron Josquin, Glover Natasha, Pingault Lise, Theil Sébastien, Jamilloux Véronique, Paux Etienne, Barbe Valérie, Mangenot Sophie, Alberti Adriana, Wincker Patrick, Quesneville Hadi, Feuillet Catherine, Choulet Frédéric
Genome Biol. 2014;15(12):546. doi: 10.1186/s13059-014-0546-4.
The 17 Gb bread wheat genome has massively expanded through the proliferation of transposable elements (TEs) and two recent rounds of polyploidization. The assembly of a 774 Mb reference sequence of wheat chromosome 3B provided us with the opportunity to explore the impact of TEs on the complex wheat genome structure and evolution at a resolution and scale not reached so far.
We develop an automated workflow, CLARI-TE, for TE modeling in complex genomes. We precisely delineate 56,488 intact and 196,391 fragmented TEs along the 3B pseudomolecule, accounting for 85% of the sequence, and reconstruct 30,199 nested insertions. TEs have been mostly silent for the last one million years, and the 3B chromosome has been shaped by a succession of bursts that occurred between 1 and 3 million years ago. Accelerated TE elimination in the high-recombination distal regions is a driving force towards chromosome partitioning. CACTAs overrepresented in the high-recombination distal regions are significantly associated with recently duplicated genes. In addition, we identify 140 CACTA-mediated gene capture events with 17 genes potentially created by exon shuffling and show that 19 captured genes are transcribed and under selection pressure, suggesting an important role for CACTAs in recent wheat adaptation.
Accurate TE modeling uncovers the dynamics of TEs in a highly complex and polyploid genome. It provides novel insights into chromosome partitioning and highlights the role of CACTA transposons in the high level of gene duplication in wheat.