Illingworth Christopher J R, Parkes Kevin E, Snell Christopher R, Mullineaux Philip M, Reynolds Christopher A
Department of Biological Sciences, University of Essex, Wivenhoe Park, Colchester, CO4 3SQ, United Kingdom.
Biophys Chem. 2008 Mar;133(1-3):28-35. doi: 10.1016/j.bpc.2007.11.004. Epub 2007 Nov 22.
Methods to determine periodicity in protein sequences are useful for inferring function. Fourier transformation is one approach, but care is required to ensure that the periodicity is genuine. Here we have shown that empirically derived statistical tables can be used as a measure of significance. Genuine protein sequence data, rather than randomly generated sequences, were used as the statistical background. The method has been applied to G-protein coupled receptor (GPCR) sequences by Fourier transformation of hydrophobicity values, codon frequencies and the extent of over-representation of codon pairs, the last of these being related to translational step times. Genuine periodicity was observed in the hydrophobicity, whereas the apparent periodicity in the translational step times (as inferred from previously reported measures) was not validated statistically. GCR2 has recently been proposed as the plant GPCR for the hormone abscisic acid. It has homology to the Lanthionine synthetase C-like family of proteins, an observation confirmed by fold recognition. Application of the Fourier transform algorithm to the GCR2 family revealed a strongly predicted sevenfold periodicity in hydrophobicity, suggesting why GCR2 has been reported to be a GPCR despite negative indications from most transmembrane prediction algorithms. The underlying multiple sequence alignment, which is also required for the Fourier transform analysis of periodicity, indicated that the hydrophobic regions around the 7 GXXG motifs commence near the C-terminal end of each of the 7 inner helices of the alpha-toroid and continue to the N-terminal region of the helix. The results clearly explain why GCR2 has been understandably but erroneously predicted to be a GPCR.
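As a rough illustration of the idea in the abstract, the sketch below computes the Fourier power of a hydrophobicity profile at a chosen number of cycles and ranks it against the same quantity computed for a set of background sequences. It is not the authors' implementation. The Kyte-Doolittle scale, the function names, the normalisation and the +1 correction in the empirical p-value are all assumptions made for illustration, and the background strings are toy placeholders standing in for genuine protein sequences.

```python
import cmath
import math

# Kyte-Doolittle hydropathy scale. Assumption: the abstract does not say
# which hydrophobicity scale was used.
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def hydrophobicity_profile(sequence):
    """Map a protein sequence to hydropathy values.

    Residues missing from the scale are skipped, which shortens the profile.
    """
    return [KYTE_DOOLITTLE[aa] for aa in sequence.upper() if aa in KYTE_DOOLITTLE]


def fourier_power(values, cycles):
    """Power of the Fourier component with `cycles` full cycles across the profile.

    The mean is subtracted first so a constant offset contributes nothing.
    """
    n = len(values)
    if n == 0:
        return 0.0
    mean = sum(values) / n
    total = sum(
        (v - mean) * cmath.exp(-2j * math.pi * cycles * k / n)
        for k, v in enumerate(values)
    )
    return abs(total) ** 2 / n


def empirical_p_value(query_power, background_powers):
    """Fraction of background powers at least as large as the query power.

    The +1 in numerator and denominator (an assumption) avoids reporting zero.
    """
    exceed = sum(1 for p in background_powers if p >= query_power)
    return (exceed + 1) / (len(background_powers) + 1)


# Toy query with seven hydrophobic blocks (hypothetical, not a real protein).
query = hydrophobicity_profile(("L" * 10 + "K" * 10) * 7)

# Placeholder background. In practice this would be a large set of genuine
# protein sequences, as the abstract describes.
background = [
    hydrophobicity_profile(s)
    for s in ("ACDEFGHIKLMNPQRSTVWY" * 7, "GASTGASTLLKK" * 12, "MKVLAAGIVDE" * 13)
]
background_powers = [fourier_power(b, 7) for b in background]
p_value = empirical_p_value(fourier_power(query, 7), background_powers)
print(f"empirical p-value at 7 cycles: {p_value:.3f}")
```

With only three background profiles the smallest attainable p-value is 0.25, so a meaningful significance table needs a far larger background set.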