Laboratory of Bioinformatics and Protein Engineering, International Institute of Molecular and Cell Biology, ul. Trojdena 4, 02-109 Warsaw, Poland.
BMC Bioinformatics. 2012 May 24;13:111. doi: 10.1186/1471-2105-13-111.
Intrinsically unstructured proteins (IUPs) lack a well-defined three-dimensional structure. Some of them may assume a locally stable structure under specific conditions, e.g. upon interaction with another molecule, while others function in a permanently unstructured state. The discovery of IUPs challenged the traditional protein structure paradigm, which stated that a specific well-defined structure defines the function of the protein. As of December 2011, approximately 60 methods for computational prediction of protein disorder from sequence have been made publicly available. They are based on different approaches, such as utilizing evolutionary information, energy functions, and various statistical and machine learning methods.
Given the diversity of existing intrinsic disorder prediction methods, we tested whether they could be combined into a more accurate meta-prediction method. We developed a method based on 13 arbitrarily chosen disorder predictors, in which the final consensus was weighted by the accuracy of the individual methods. We also developed a disorder predictor, GSmetaDisorder3D, that uses no third-party disorder predictors; instead, it uses alignments to known protein structures, reported by protein fold-recognition methods, to infer potentially structured and unstructured regions. Following the success of our disorder predictors in the CASP8 benchmark, we combined them into a meta-meta predictor called GSmetaDisorderMD, which was the top-scoring method in the subsequent CASP9 benchmark.
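The accuracy-weighted consensus idea can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the predictor names, the accuracy values, and the 0.5 threshold are all hypothetical, and the sketch assumes each predictor returns one disorder score in [0, 1] per residue.

```python
from typing import Dict, List


def weighted_consensus(scores: Dict[str, List[float]],
                       accuracy: Dict[str, float]) -> List[float]:
    """Per-residue mean of predictor scores, weighted by predictor accuracy.

    `scores` maps a (hypothetical) predictor name to its per-residue
    disorder scores; `accuracy` maps the same names to a benchmark accuracy.
    """
    if not scores:
        raise ValueError("no predictors given")
    lengths = {len(s) for s in scores.values()}
    if len(lengths) != 1:
        raise ValueError("all predictors must score the same residues")
    n_residues = lengths.pop()
    total_weight = sum(accuracy[name] for name in scores)
    if total_weight <= 0:
        raise ValueError("accuracy weights must sum to a positive value")
    return [
        sum(accuracy[name] * s[i] for name, s in scores.items()) / total_weight
        for i in range(n_residues)
    ]


def call_disorder(consensus: List[float], threshold: float = 0.5) -> str:
    """Label each residue 'D' (disordered) or 'O' (ordered).

    The 0.5 cutoff is an assumption made for this illustration.
    """
    return "".join("D" if c >= threshold else "O" for c in consensus)


if __name__ == "__main__":
    # Made-up scores for a three-residue sequence from two made-up predictors.
    example_scores = {
        "predictor_a": [0.9, 0.6, 0.1],
        "predictor_b": [0.7, 0.2, 0.3],
    }
    example_accuracy = {"predictor_a": 0.8, "predictor_b": 0.6}
    consensus = weighted_consensus(example_scores, example_accuracy)
    print(consensus, call_disorder(consensus))
```

Weighting by accuracy lets better-performing predictors dominate the consensus while still letting weaker ones contribute where the stronger ones are uncertain.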
The disorder predictors described in this article are available through the MetaDisorder web server at http://iimcb.genesilico.pl/metadisorder/. Results are presented both in an easily interpretable, interactive mode and in a simple text format suitable for machine processing.