Kazazian HH, Wong C, Youssoufian H, Scott AF, Phillips DG, Antonarakis SE
Department of Pediatrics, Johns Hopkins University School of Medicine, Baltimore, Maryland 21205.
Nature. 1988 Mar 10;332(6160):164-6. doi: 10.1038/332164a0.
L1 sequences are a human-specific family of long, interspersed, repetitive elements, present as approximately 10^5 copies dispersed throughout the genome. The full-length L1 sequence is 6.1 kilobases, but most L1 elements are truncated at the 5' end, resulting in a fivefold higher copy number of 3' sequences. The nucleotide sequence of L1 elements includes an A-rich 3' end and two long open reading frames (orf-1 and orf-2), the second of which encodes a potential polypeptide with sequence homology to the reverse transcriptases. This structure suggests that L1 elements represent a class of non-viral retrotransposons. A number of L1 complementary DNAs (cDNAs), including a nearly full-length element, have been isolated from an undifferentiated teratocarcinoma cell line. We now report insertions of L1 elements into exon 14 of the factor VIII gene in two of 240 unrelated patients with haemophilia A. Both insertions (3.8 and 2.3 kilobases, respectively) contain 3' portions of the L1 sequence, including the poly(A) tract, and create target site duplications of at least 12 and 13 nucleotides, respectively, of the factor VIII gene. In addition, their 3'-trailer sequences following orf-2 are nearly identical to the consensus sequence of L1 cDNAs (ref. 6). These results indicate that certain L1 sequences in man can be dispersed, presumably by an RNA intermediate, and cause disease by insertional mutation.