Bringaud Frédéric, Bartholomeu Daniella C, Blandin Gaëlle, Delcher Arthur, Baltz Théo, El-Sayed Najib M A, Ghedin Elodie
Laboratoire de Génomique Fonctionnelle des Trypanosomatides, UMR-5162 Centre National de la Recherche Scientifique, Université Victor Segalen Bordeaux 2, Bordeaux Cedex, France.
Mol Biol Evol. 2006 Feb;23(2):411-20. doi: 10.1093/molbev/msj046. Epub 2005 Nov 2.
The trypanosomatid protozoan Trypanosoma cruzi contains long autonomous (L1Tc) and short nonautonomous (NARTc) non-long terminal repeat retrotransposons. NARTc (0.25 kb) was probably derived from L1Tc (4.9 kb) by 3'-deletion. It has been proposed that their apparently random distribution in the genome is related to the L1Tc-encoded apurinic/apyrimidinic endonuclease (APE) activity, which repairs modified residues. To address this question, we used the T. cruzi (CL-Brener strain) genome data to analyze the distribution of all the L1Tc/NARTc elements present in contigs larger than 10 kb. This data set, which represents 0.91x sequence coverage of the haploid nuclear genome (approximately 55 Mb), contains 419 elements, including 112 full-length L1Tc elements (14 of which are potentially functional) and 84 full-length NARTc elements. Approximately half of the full-length elements are flanked by a target site duplication, and most of these duplications (87%) are 12 bp long. Statistical analyses of sequences flanking the full-length elements show the same highly conserved pattern upstream of both the L1Tc and NARTc retrotransposons. The two most conserved residues are a guanine and an adenine, which flank the site where first-strand cleavage is performed by the element-encoded endonuclease activity. This analysis clearly indicates that the L1Tc and NARTc elements display relative site specificity for insertion, which suggests that the APE activity is not responsible for first-strand cleavage of the target site.
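The abstract does not say which statistics were applied to the flanking sequences. As a rough illustration only, one common way to quantify position-wise conservation in a set of aligned flanks is per-column information content (2 bits minus the Shannon entropy for DNA). The sketch below makes that assumption; the function name `position_information` and the toy sequences are hypothetical, are not taken from the T. cruzi data, and no small-sample correction is applied.

```python
import math
from collections import Counter


def position_information(seqs):
    """Return, for each alignment column, (most common base, information in bits).

    Information content is 2 - H, where H is the Shannon entropy of the
    base frequencies in that column (maximum 2 bits for a 4-letter alphabet).
    """
    length = len(seqs[0])
    if any(len(s) != length for s in seqs):
        raise ValueError("sequences must be aligned to the same length")
    result = []
    for col in range(length):
        counts = Counter(s[col] for s in seqs)
        total = sum(counts.values())
        entropy = -sum((n / total) * math.log2(n / total) for n in counts.values())
        base, _ = counts.most_common(1)[0]
        result.append((base, 2.0 - entropy))
    return result


# Made-up toy flanks (NOT real data): columns 3 and 4 are fully conserved.
toy_flanks = ["ACGAT", "TCGAA", "GTGAC", "CAGAG"]
profile = position_information(toy_flanks)

for index, (base, bits) in enumerate(profile, start=1):
    print(f"position {index}: {base} ({bits:.2f} bits)")
```

In this toy input the invariant columns score the full 2 bits, while columns in which all four bases occur equally often score 0; a strongly conserved pair of residues flanking a cleavage site would stand out in the same way.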