Saini Harpreet Kaur, Fischer Daniel
Computer Science and Engineering Dept., University at Buffalo, Buffalo, NY 14260-2000, USA.
BMC Genomics. 2007 May 9;8:115. doi: 10.1186/1471-2164-8-115.
Mimivirus, isolated from Acanthamoeba polyphaga, is the largest virus discovered so far. It is unique among viruses in having genes related to translation, DNA repair and replication that bear close homology to eukaryotic genes. Nevertheless, only a small fraction (33%) of the proteins encoded in this genome have been assigned a function. Furthermore, a large fraction of the unassigned protein sequences bear no sequence similarity to proteins from other genomes. These sequences are referred to as ORFans. Because they lack sequence similarity to other proteins, they cannot be assigned putative functions using standard sequence comparison methods. As part of our genome-wide computational efforts to characterize Mimivirus ORFans, we applied fold-recognition methods to predict the structures of these ORFans, and then derived functions from the conservation of functionally important residues in sequence-template alignments.
Using fold recognition, we have identified highly confident computational 3D structural assignments for 21 Mimivirus ORFans. In addition, highly confident functional predictions for 6 of these ORFans were derived by analyzing the conservation of functional motifs between the predicted structures and proteins of known function. This analysis allowed us to classify these 6 previously unannotated ORFans into their specific protein families: carboxylesterase/thioesterase, metal-dependent deacetylase, P-loop kinases, 3-methyladenine DNA glycosylase, BTB domain and eukaryotic translation initiation factor eIF4E.
Using stringent fold-recognition criteria, we have assigned three-dimensional structures to 21 of the ORFans encoded in the Mimivirus genome. Further, based on the 3D models and an analysis of the conservation of functionally important residues and motifs, we were able to derive functional attributes for 6 of these ORFans. Our computational identification of important functional sites in these ORFans can serve as the basis for subsequent experimental verification of our predictions. Further computational and experimental studies are required to elucidate the 3D structures and functions of the remaining Mimivirus ORFans.
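The residue-conservation check described above can be illustrated with a minimal sketch. It assumes a pairwise sequence-template alignment is already available as two gapped strings and that the template's functionally important residues are known by their 1-based positions in the ungapped template. The function names, the example sequences and the site list are all hypothetical and are not taken from the study; the authors' actual pipeline is not described at this level of detail.

```python
from typing import Dict, List, Tuple


def map_template_positions(template_aln: str, query_aln: str) -> Dict[int, str]:
    """Map each 1-based ungapped template position to the query character
    aligned to it ('-' if the query has a gap in that column)."""
    if len(template_aln) != len(query_aln):
        raise ValueError("aligned sequences must have equal length")
    mapping: Dict[int, str] = {}
    pos = 0
    for t_char, q_char in zip(template_aln, query_aln):
        if t_char != "-":  # only columns holding a template residue advance the counter
            pos += 1
            mapping[pos] = q_char
    return mapping


def conserved_sites(
    template_aln: str,
    query_aln: str,
    functional_sites: Dict[int, str],
) -> List[Tuple[int, str, str, bool]]:
    """For each functional site, report (position, expected residue,
    observed query residue, whether the two are identical)."""
    mapping = map_template_positions(template_aln, query_aln)
    report = []
    for pos, expected in sorted(functional_sites.items()):
        observed = mapping.get(pos, "-")
        report.append((pos, expected, observed, observed == expected))
    return report


# Hypothetical toy alignment and site list, for illustration only.
template_aln = "MKG-SGKST"
query_aln = "MRGAS-KTT"
sites = {3: "G", 6: "K", 7: "S"}

report = conserved_sites(template_aln, query_aln, sites)
n_conserved = sum(1 for _, _, _, same in report if same)
for pos, expected, observed, same in report:
    print(pos, expected, observed, "conserved" if same else "not conserved")
print(f"{n_conserved}/{len(report)} functional sites conserved")
```

This sketch counts only strict identity at each site. A real analysis would more plausibly also accept chemically similar substitutions and weigh the confidence of the alignment itself, which is why the check is presented here as an illustration rather than a usable predictor.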