Kojima Kenji K
Genetic Information Research Institute.
Department of Life Sciences, National Cheng Kung University.
Genes Genet Syst. 2020 Jan 30;94(6):233-252. doi: 10.1266/ggs.18-00024. Epub 2018 Nov 9.
The majority of eukaryotic genomes contain a large fraction of repetitive sequences that primarily originate from transpositional bursts of transposable elements (TEs). Repbase serves as a database for eukaryotic repetitive sequences and has become the largest collection of eukaryotic TEs. During the development of Repbase, many new superfamilies/lineages of TEs were reported, including Helitron, Polinton, Ginger and SINEU. The unique composition of protein domains and DNA motifs in TEs sometimes indicates novel mechanisms of transposition, replication, anti-suppression or proliferation. This review introduces and summarizes our current understanding of the diversity of eukaryotic TEs in sequence, protein domain composition and structural hallmarks, based on the classification system implemented in Repbase. Autonomous eukaryotic TEs can be divided into two groups: Class I TEs, also called retrotransposons, and Class II TEs, or DNA transposons. Four groups of autonomous retrotransposons are well accepted: long terminal repeat (LTR) retrotransposons (which include endogenous retroviruses), non-LTR retrotransposons, tyrosine recombinase retrotransposons and Penelope-like elements. They share reverse transcriptase for replication but differ in the catalytic components responsible for integration into the host genome. Similarly, at least three transposition machineries have been reported in eukaryotic DNA transposons: DDD/E transposase, tyrosine recombinase, and HUH endonuclease combined with helicase. Among these, TEs with DDD/E transposase are dominant and are classified into 21 superfamilies in Repbase. Non-autonomous TEs are either simple derivatives generated by internal deletion or are composed of several units that originated independently.
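The classification of autonomous TEs outlined in the abstract can be sketched as a small lookup structure. This is only an illustrative sketch: the names `TEGroup`, `AUTONOMOUS_TE_GROUPS` and `groups_in` are hypothetical and are not part of Repbase or any of its tools, and the entries are limited to the groups and machineries named in the abstract.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TEGroup:
    """One group of autonomous TEs as named in the abstract (hypothetical type)."""
    te_class: str   # "Class I" (retrotransposons) or "Class II" (DNA transposons)
    name: str       # group name (Class I) or transposition machinery (Class II)
    note: str = ""  # extra detail stated in the abstract, if any


# Entries are taken only from the abstract; this is not a full Repbase listing.
AUTONOMOUS_TE_GROUPS = (
    TEGroup("Class I", "LTR retrotransposon", "includes endogenous retroviruses"),
    TEGroup("Class I", "non-LTR retrotransposon"),
    TEGroup("Class I", "tyrosine recombinase retrotransposon"),
    TEGroup("Class I", "Penelope-like element"),
    TEGroup("Class II", "DDD/E transposase", "21 superfamilies in Repbase"),
    TEGroup("Class II", "tyrosine recombinase"),
    TEGroup("Class II", "HUH endonuclease combined with helicase"),
)


def groups_in(te_class: str) -> list[str]:
    """Return the names of the groups listed for the given TE class."""
    return [g.name for g in AUTONOMOUS_TE_GROUPS if g.te_class == te_class]


if __name__ == "__main__":
    for te_class in ("Class I", "Class II"):
        print(te_class, "->", groups_in(te_class))
```

A flat tuple of records is used here rather than a nested tree because the abstract names only two levels (class and group); a deeper hierarchy would need the superfamily lists from the review itself.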