Bodén Mikael, Bailey Timothy L
School of Information Technology and Electrical Engineering, The University of Queensland, QLD 4072, Australia.
Bioinformatics. 2006 Aug 1;22(15):1809-14. doi: 10.1093/bioinformatics/btl198. Epub 2006 May 23.
Conformational flexibility is essential to the function of many proteins, for example in catalytic activity. To assist efforts in determining and exploring the functional properties of a protein, it is desirable to automatically identify regions that are prone to undergo conformational changes. It was recently shown that a probabilistic predictor of continuum secondary structure is more accurate than categorical predictors for structurally ambivalent sequence regions, suggesting that such models are suited to characterizing protein flexibility.
We develop a computational method for identifying regions that are prone to conformational change directly from the amino acid sequence. The method uses the entropy of the probabilistic output of an 8-class continuum secondary structure predictor. Results for 171 unique amino acid sequences with well-characterized variable structure (identified in the 'Macromolecular movements database') indicate that the method is highly sensitive at identifying flexible protein regions, but false positives remain a problem. The method can be used to explore conformational flexibility of proteins (including hypothetical or synthetic ones) whose structure is yet to be determined experimentally.
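The entropy step described above can be illustrated with a minimal sketch. It assumes the predictor yields, for each residue, a vector of eight class probabilities, and it scores each residue by the Shannon entropy of that vector. The function names, the use of bits as the unit, and the cut-off of 1.5 are hypothetical choices made for illustration; the abstract does not specify how entropy values are turned into flexible-region calls.

```python
import math


def residue_entropy(probs):
    """Shannon entropy, in bits, of one residue's class-probability vector.

    The vector is renormalised first, so small rounding errors in the
    predictor output do not distort the score.
    """
    total = sum(probs)
    if total <= 0:
        raise ValueError("probabilities must sum to a positive value")
    entropy = 0.0
    for p in probs:
        q = p / total
        if q > 0:  # treat 0 * log(0) as 0
            entropy -= q * math.log2(q)
    return entropy


def flag_flexible(profile, threshold=1.5):
    """Mark residues whose entropy reaches the threshold.

    `profile` is a list of per-residue probability vectors. The default
    threshold is an illustrative assumption, not a value from the paper.
    """
    return [residue_entropy(p) >= threshold for p in profile]


if __name__ == "__main__":
    confident = [1.0, 0, 0, 0, 0, 0, 0, 0]  # predictor certain of one class
    ambivalent = [0.125] * 8                # predictor maximally uncertain
    print(residue_entropy(confident), residue_entropy(ambivalent))
    print(flag_flexible([confident, ambivalent]))
```

With eight classes the entropy ranges from 0 bits (one class has all the probability mass) to 3 bits (all classes equally likely), so a high score marks a residue for which the predictor is structurally ambivalent.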
The predictor, sequence data and supplementary studies are available at http://pprowler.itee.uq.edu.au/sspred/ and are free for academic use.