Kajikawa M, Ohshima K, Okada N
Faculty of Bioscience and Biotechnology, Tokyo Institute of Technology, Japan.
Mol Biol Evol. 1997 Dec;14(12):1206-17. doi: 10.1093/oxfordjournals.molbev.a025730.
CR1 elements are a family of retroposons. They are classified as long interspersed elements (LINEs) or non-long-terminal-repeat (non-LTR) retrotransposons, and they have been found in the genomes of many vertebrates. However, they have been only partially characterized, and only a 2-kb region of the 3' end of chicken CR1 has been sequenced. In the present study, we determined the entire consensus sequence of CR1 elements in the turtle genome, designated PsCR1. The first open reading frame (ORF1) of PsCR1 has two unusual arrangements of Cys residues. One of them includes a zinc finger motif, CX2CX14CX2C. The putative zinc finger has cysteine residues with identical spacing and a similar amino acid composition to those found in the species-specific transcription initiation factors SL1 and TIF-IB. The 5' untranslated region (5' UTR) of PsCR1 contains a sequence similar to part of the human L1 promoter, L1 site A, and several cis elements of the type found in eukaryotic genes. Within a region of about 500 bp, there are nine "E boxes," cis elements that are recognized by the basic helix-loop-helix (bHLH) family of proteins. This observation raises the possibility that cellular transcription factors that bind to these sequences might act in concert to regulate the expression of PsCR1. The extent of the sequence divergence of the 3' UTR of CR1 between species was found to be lower than the rate of nonsynonymous substitutions per site in ORF2, suggesting that a strict functional constraint must exist for this region. This result strongly suggests that the conserved 3'-end sequence of CR1 is the recognition site for the reverse transcriptase of CR1. A discussion is presented of a possible mechanism for the integration of CR1 elements and also of the intriguing possible recruitment of the reverse transcriptase for the retroposition of SINEs.
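The motif notation used above can be read as a pattern: CX2CX14CX2C is a cysteine, any two residues, a cysteine, any fourteen residues, a cysteine, any two residues, and a final cysteine, and an E box is conventionally written as the consensus CANNTG. The following is only a minimal sketch of how such patterns could be scanned for with regular expressions. The function names and the toy sequences are hypothetical and are not taken from the study, and the abstract does not state which tools the authors actually used.

```python
import re

# CX2CX14CX2C: Cys, 2 residues, Cys, 14 residues, Cys, 2 residues, Cys.
# The lookahead lets overlapping candidates be reported as well.
ZINC_FINGER = re.compile(r"(?=(C.{2}C.{14}C.{2}C))")

# E box consensus CANNTG (N = any nucleotide). The consensus is its own
# reverse complement, so scanning one strand is enough.
E_BOX = re.compile(r"(?=(CA[ACGT]{2}TG))")


def find_zinc_fingers(protein):
    """Return (0-based start, matched segment) for each CX2CX14CX2C hit."""
    return [(m.start(), m.group(1)) for m in ZINC_FINGER.finditer(protein.upper())]


def find_e_boxes(dna):
    """Return (0-based start, hexamer) for each CANNTG hit."""
    return [(m.start(), m.group(1)) for m in E_BOX.finditer(dna.upper())]


if __name__ == "__main__":
    # Toy sequences for illustration only; they are NOT PsCR1 sequences.
    toy_protein = "MK" + "C" + "AA" + "C" + "G" * 14 + "C" + "AA" + "C" + "LL"
    toy_dna = "TTCAGCTGAACACGTGTT"
    print(find_zinc_fingers(toy_protein))
    print(find_e_boxes(toy_dna))
```

A pattern match of this kind only flags candidate sites. Whether a candidate zinc finger or E box is functional is a separate question that the abstract itself treats as a possibility rather than a conclusion.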