Romano C M, Ramalho R F, Zanotto P M de A
Laboratory of Molecular Evolution and Bioinformatics, Department of Microbiology, Biomedical Sciences Institute - ICB II, University of São Paulo, São Paulo, Brazil.
Arch Virol. 2006 Nov;151(11):2215-28. doi: 10.1007/s00705-006-0792-1. Epub 2006 Jul 10.
Several families of endogenous retrovirus (ERV) exist in copious numbers in the genomes of primate species. We therefore undertook a systematic search for endogenous retrovirus sequences of the ERV-K family in both the human (Homo sapiens) and chimpanzee (Pan troglodytes) genomes. Using conserved motifs of ERV-K as queries, we identified and characterized 76 complete ERV-K elements: 55 in human (HERV-K), 34 of which were described previously, and 21 in chimpanzee (CERV-K). Phylogenetic analysis using coding regions and LTRs showed the existence of two main branches. Group I was the most heterogeneous and had an average integration time of 18.3 MYBP (million years before present), using rates ranging from 1.5 to 4.0 × 10^-9 s/s/y (substitutions per site per year). Group O/N integrated around 19.4 MYBP and nested Group N integrated about 14 MYBP. We found evidence for strong positive selection on the gag, pol and env coding regions and for A/T hypermutation. Our data suggest that the endogenous elements were possibly involved in chromosomal rearrangements and retained a great deal of information from their active stage, most likely as a consequence of host interactions. This study also contributes to the annotation effort of both human and chimpanzee genomes.
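The abstract does not say how substitution rates were converted into integration times. A common approach for ERVs, assumed here for illustration only, dates an element from the divergence K between its 5' and 3' LTRs, which are identical at integration and then evolve independently, so that T = K / (2r). The minimal sketch below applies that relation across the rate range quoted above. The function names and the example divergence of 0.06 are hypothetical and are not taken from the paper.

```python
def integration_age_mybp(ltr_divergence, rate_per_site_per_year):
    """Estimate integration age in millions of years before present (MYBP).

    Assumes the standard LTR-divergence relation T = K / (2r), where K is the
    per-site divergence between the two LTRs of one element and r is the
    substitution rate per site per year. This is an assumption, not
    necessarily the method used in the paper.
    """
    if rate_per_site_per_year <= 0:
        raise ValueError("substitution rate must be positive")
    if ltr_divergence < 0:
        raise ValueError("divergence cannot be negative")
    years = ltr_divergence / (2.0 * rate_per_site_per_year)
    return years / 1e6


def age_range_mybp(ltr_divergence, slow_rate=1.5e-9, fast_rate=4.0e-9):
    """Return (youngest, oldest) age estimates over a range of rates.

    The default rates are the bounds quoted in the abstract (s/s/y).
    A faster rate gives a younger estimate, a slower rate an older one.
    """
    youngest = integration_age_mybp(ltr_divergence, fast_rate)
    oldest = integration_age_mybp(ltr_divergence, slow_rate)
    return youngest, oldest


if __name__ == "__main__":
    # Hypothetical LTR divergence of 6% (0.06 substitutions per site).
    low, high = age_range_mybp(0.06)
    print(f"Estimated integration age: {low:.1f} to {high:.1f} MYBP")
```

Because the estimate is inversely proportional to the rate, the choice of rate within the quoted range changes the inferred age by a factor of about 2.7 (4.0 / 1.5), which is why a rate range is usually reported alongside a date.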