Gatherer Derek, McEwan Neil R
Drug Design, RiboTargets Ltd., Granta Park, Cambridge CB1 6GB, UK.
J Mol Evol. 2003 Aug;57(2):149-58. doi: 10.1007/s00239-002-2462-1.
Periodicity was quantified in 4289 confirmed and putative protein sequences of Escherichia coli K12, using a simple chi-square technique previously shown to reveal triplet periodicity in coding DNA. Periodicities were calculated from period n = 2 to period n = 50 in nine different alphabetic representations of the proteins. By comparison with a randomly generated proteome of the same composition, the E. coli proteome does not contain a significant excess of periodic proteins. However, after Bonferroni correction, 60 proteins do appear to be significantly periodic in at least one alphabetic representation at p < 0.01, and 30 at p < 0.001. These are compared with significantly periodic proteins of solved three-dimensional structure, detected by an identical analysis of the sequences from a protein structure database. It is concluded that there is no evidence for the proteome-wide quasi-periodicity predicted by the "duplication and divergence" model of protein evolution, and that the major periodicity detected is a consequence of the repetitive tendencies within alpha-helices. However, not all sequence periodicities can be explained in terms of observable secondary structure: where sequence periodicity can be compared with solved structure, there is often no structural regularity that would provide an obvious explanation in terms of natural selection on protein function.
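The abstract does not spell out the chi-square statistic, so the following is only a minimal sketch of one common way such a test can be set up, not the authors' implementation. It assumes a contingency table of symbol against position modulo n, with expected counts taken from the marginal totals. The function names `periodicity_chi_square` and `scan_periods` are hypothetical, and the conversion to p-values and the Bonferroni correction are left out.

```python
from collections import Counter


def periodicity_chi_square(seq, period):
    """Chi-square statistic for a symbol-by-phase contingency table.

    Assumption (not taken from the abstract): the phase of position i
    is i % period, and expected counts come from the row and column
    totals. Returns (chi2, degrees_of_freedom).
    """
    n = len(seq)
    if period < 2 or n < period:
        raise ValueError("period must be >= 2 and no longer than the sequence")

    # Observed count of each symbol at each phase of the period.
    observed = Counter((sym, i % period) for i, sym in enumerate(seq))
    symbol_totals = Counter(seq)
    phase_totals = Counter(i % period for i in range(n))

    chi2 = 0.0
    for sym, sym_total in symbol_totals.items():
        for phase, phase_total in phase_totals.items():
            expected = sym_total * phase_total / n
            diff = observed[(sym, phase)] - expected
            chi2 += diff * diff / expected

    dof = (len(symbol_totals) - 1) * (period - 1)
    return chi2, dof


def scan_periods(seq, min_period=2, max_period=50):
    """Evaluate every period in the range used in the abstract (n = 2 to 50)."""
    return {
        p: periodicity_chi_square(seq, p)
        for p in range(min_period, max_period + 1)
        if len(seq) >= p
    }


if __name__ == "__main__":
    toy = "AB" * 50  # artificial, perfectly period-2 sequence (not real data)
    chi2, dof = periodicity_chi_square(toy, 2)
    print(f"period 2: chi2 = {chi2:.1f}, dof = {dof}")
```

A reduced alphabet (for example, grouping residues into classes) could be applied by mapping each residue to its class before calling `periodicity_chi_square`. The nine representations used in the study are not listed in the abstract, so any such mapping would be an additional assumption.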