Hemalatha Golaconda, Kishore Inampudi Krishna, Rao Raghavarapu Srinivasa, Guruprasad Lalitha
School of Chemistry, University of Hyderabad, Hyderabad 500046, India.
Protein Pept Lett. 2007;14(7):692-7. doi: 10.2174/092986607781483903.
Using automated in silico methods, we have identified four novel repeats and five novel domains in proteins of the Pyrobaculum aerophilum str. IM2 proteome. A "repeat" is a region of fewer than 55 amino acid residues that occurs more than once in the protein sequence, sometimes in tandem. A "domain" is a conserved region of more than 55 amino acid residues that may be present as a single copy or as multiple copies in the protein sequence. The nine elements are (1) the 85-residue AAG domain, (2) the 72-residue GFGN domain, (3) the 43-residue KGG repeat, (4) the 25-residue RWE repeat, (5) the 25-residue RID repeat, (6) the 108-residue NDFA domain, (7) the 140-residue VxY domain, (8) the 35-residue LLPN repeat and (9) the 98-residue GxY domain. Each repeat or domain is characterized by specific conserved sequence motifs. We discuss the occurrence of these repeats and domains in proteins from other genomes and their probable secondary structure.
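The length-based distinction between repeats and domains can be illustrated with a minimal sketch. It applies only the 55-residue cutoff stated in the abstract to the nine listed elements; it does not model the other criteria (recurrence within a sequence, conservation), and it is not the authors' detection pipeline. The names `CUTOFF`, `ELEMENTS` and `classify` are hypothetical, introduced here for illustration.

```python
# Length cutoff (in residues) separating repeats from domains, as stated in the abstract.
CUTOFF = 55

# (name, length in residues) for the nine elements listed in the abstract.
ELEMENTS = [
    ("AAG", 85), ("GFGN", 72), ("KGG", 43), ("RWE", 25), ("RID", 25),
    ("NDFA", 108), ("VxY", 140), ("LLPN", 35), ("GxY", 98),
]


def classify(length: int) -> str:
    """Label a conserved region by length alone (hypothetical helper)."""
    if length < CUTOFF:
        return "repeat"
    if length > CUTOFF:
        return "domain"
    # The abstract does not say how a region of exactly 55 residues is treated.
    return "undefined"


labels = {name: classify(length) for name, length in ELEMENTS}
repeats = sorted(name for name, kind in labels.items() if kind == "repeat")
domains = sorted(name for name, kind in labels.items() if kind == "domain")

print(repeats)  # ['KGG', 'LLPN', 'RID', 'RWE']
print(domains)  # ['AAG', 'GFGN', 'GxY', 'NDFA', 'VxY']
```

Applied to the listed lengths, the cutoff reproduces the four repeats and five domains named in the abstract.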