Section of Virology, Department of Medical Sciences, Uppsala University, Uppsala, Sweden.
Mob DNA. 2013 Feb 1;4(1):5. doi: 10.1186/1759-8753-4-5.
Long terminal repeats (LTRs, consisting of U3-R-U5 portions) are important elements of retroviruses and related retrotransposons. They are difficult to analyse because of their variability. The aim was to obtain a more comprehensive view of the structure, diversity and phylogeny of LTRs than was hitherto possible.
Hidden Markov models (HMMs) were created for 11 clades of LTRs belonging to Retroviridae (class III retroviruses), animal Metaviridae (Gypsy/Ty3) elements and plant Pseudoviridae (Copia/Ty1) elements, complementing our work with Orthoretrovirus HMMs. The great variation in LTR length of plant Metaviridae and the few, divergent animal Pseudoviridae prevented building HMMs for either of these groups. Animal Metaviridae LTRs had the same conserved motifs as retroviral LTRs, confirming that the two groups are closely related. The conserved motifs were the short inverted repeats (SIRs), integrase recognition signals (5′ TGTTRNR…YNYAACA 3′); the polyadenylation signal or AATAAA motif; a GT-rich stretch downstream of the polyadenylation signal; and a less conserved AT-rich stretch corresponding to the core promoter element, the TATA box. Plant Pseudoviridae LTRs differed slightly in having a conserved TATA box, TATATA, but no conserved polyadenylation signal, plus a much shorter R region. The sensitivity of the HMMs for detection in genomic sequences was around 50% for most models, at a relatively high specificity, suitable for genome screening. The HMMs yielded consensus sequences, which were aligned by building an HMM from them (a 'Superviterbi' alignment). This yielded a phylogenetic tree that was compared with a Pol-based tree. Both LTR and Pol trees supported monophyly of retroviruses. In both, Pseudoviridae was ancestral to all other LTR retrotransposons. However, the LTR trees showed the chromovirus portion of Metaviridae clustering together with Pseudoviridae, dividing Metaviridae into two portions with distinct phylogeny.
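The conserved motifs listed above can be illustrated with a minimal Python sketch. It is not the HMM-based method used in this work: it swaps in plain regular-expression matching of the IUPAC consensus strings quoted in the text (R = A/G, Y = C/T, N = any base). The names `IUPAC`, `MOTIFS`, `find_motif` and `toy_ltr` are hypothetical, and the toy sequence is invented so that every motif occurs once; a real LTR would not carry both the AATAAA signal and the plant Pseudoviridae TATATA box in this arrangement.

```python
import re

# IUPAC nucleotide codes used in the motifs, mapped to regex character classes.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "[AG]",    # purine
    "Y": "[CT]",    # pyrimidine
    "N": "[ACGT]",  # any base
}

# Consensus strings quoted in the abstract (dictionary keys are illustrative labels).
MOTIFS = {
    "SIR_5prime": "TGTTRNR",
    "SIR_3prime": "YNYAACA",
    "polyA_signal": "AATAAA",
    "TATA_box": "TATATA",
}


def find_motif(seq, motif):
    """Return 0-based start positions of every (possibly overlapping) match."""
    core = "".join(IUPAC[base] for base in motif)
    # A zero-width lookahead lets overlapping occurrences be reported too.
    pattern = re.compile("(?=" + core + ")")
    return [m.start() for m in pattern.finditer(seq.upper())]


# Invented toy sequence: 5' SIR, TATA box, poly(A) signal, GT-rich stretch, 3' SIR.
toy_ltr = "TGTTAGA" + "CCTATATACC" + "GGAATAAAGG" + "TGTGTGT" + "CTCAACA"

if __name__ == "__main__":
    for name, motif in MOTIFS.items():
        print(name, motif, find_motif(toy_ltr, motif))
```

Exact consensus matching of this kind misses divergent copies, which is the limitation that position-specific probabilistic models such as HMMs are designed to address.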
The HMMs clearly demonstrated a unitary conserved structure of LTRs, supporting the view that they arose once during evolution. We attempted to follow the evolution of LTRs by tracing their functional foundations, that is, acquisition of RNase H, a combined promoter/polyadenylation site, integrase, hairpin priming and the primer binding site (PBS). Available information did not support a simple evolutionary chain of events.