Peterson-Burch Brooke D, Voytas Daniel F
Department of Zoology & Genetics, Iowa State University, Ames, IA 50011, USA.
Mol Biol Evol. 2002 Nov;19(11):1832-45. doi: 10.1093/oxfordjournals.molbev.a004008.
A comprehensive survey of the Pseudoviridae (Ty1/copia) retroelement family was conducted using the GenBank sequence database and completed genome sequences of several model organisms. Plant genomes were the most abundant sources of Pseudoviridae, with the Arabidopsis thaliana genome having 276 distinct elements. A reverse transcriptase amino acid sequence phylogeny indicated that the Pseudoviridae comprises highly divergent members. Coding sequences for a representative subset of elements were analyzed to identify conserved domains and differences that may underlie functional divergence. With the exception of some fungal elements (e.g., Ty1), most Pseudoviridae encode Gag and Pol on a single open reading frame. In addition to the nearly ubiquitous RNA-binding motif of nucleocapsid, three new conserved domains were identified in Gag. The pol-encoded aspartic protease was similar to the retroviral enzyme and could be mapped onto the HIV-1 structure. Pol was highly conserved throughout the family. The greatest divergence among Pol sequences was seen in the C-terminus of integrase (IN). We defined a large motif (GKGY) after the IN catalytic domain that is unique to the Pseudoviridae. Additionally, the extreme C-terminus of IN is rich in simple sequence motifs. A distinct lineage of Pseudoviridae in plants has env-like genes. This lineage has undergone a large expansion of Gag characterized by an alpha-helix-rich domain containing coiled-coil motifs. In several elements, this domain is flanked on both sides by RNA-binding domains. We propose that this monophyletic lineage defines a new Pseudoviridae genus, herein referred to as the Agrovirus.