Department of Chemistry and Biochemistry, Ohio State University, Columbus, Ohio 43210, United States.
J Phys Chem B. 2018 Apr 12;122(14):3920-3930. doi: 10.1021/acs.jpcb.8b01763. Epub 2018 Mar 29.
Although many proteins require well-folded structures to carry out their biological functions, a large fraction of functional proteins contain regions, known as intrinsically disordered protein regions, where stable structures are unlikely to form. Intrinsically disordered proteins play notable functional roles in transcriptional regulation, translation, and cellular signal transduction. Moreover, intrinsically disordered protein regions are highly abundant in many proteins associated with various human diseases, so these segments have become attractive targets for potential therapeutics. Over the past decades, numerous computational methods have been developed to predict disordered regions of proteins. Here we introduce a user-friendly and reliable approach for predicting disordered protein regions using the structure prediction software Rosetta. Using 245 proteins from a benchmark data set (16 DisProt database proteins) and a test data set (229 proteins with NMR data), we predict the global protein structures with Rosetta and show that there is a statistically significant difference between Rosetta scores in disordered and ordered regions, with scores being less favorable in disordered regions. Furthermore, the difference in scores between ordered and disordered regions is sufficient to accurately identify disordered protein regions. As a result, our Rosetta ResidueDisorder method (benchmark data set prediction accuracy of 71.77% and independent test data set prediction accuracy of 65.37%) outperformed other established disorder prediction tools and did not exhibit a prediction bias toward either ordered or disordered regions. To facilitate usage, a Rosetta application has been developed for the Rosetta ResidueDisorder method.
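The core idea of the abstract, that less favorable per-residue scores indicate disorder, can be illustrated with a minimal Python sketch. This is not the Rosetta ResidueDisorder application itself. The abstract does not state the decision rule, so the window smoothing, the `cutoff` value, and the function names `windowed_mean`, `predict_disorder`, and `accuracy` are all assumptions made for illustration, and the scores are invented toy values rather than Rosetta output.

```python
from typing import List


def windowed_mean(scores: List[float], half_window: int) -> List[float]:
    """Average each residue's score with its neighbors (window truncated at the ends)."""
    n = len(scores)
    smoothed = []
    for i in range(n):
        lo = max(0, i - half_window)
        hi = min(n, i + half_window + 1)
        window = scores[lo:hi]
        smoothed.append(sum(window) / len(window))
    return smoothed


def predict_disorder(scores: List[float], cutoff: float, half_window: int = 2) -> List[bool]:
    """Label a residue disordered (True) when its smoothed score is less favorable.

    Assumption: lower (more negative) scores are more favorable, so a smoothed
    score above `cutoff` is treated as disordered. The cutoff is hypothetical.
    """
    return [avg > cutoff for avg in windowed_mean(scores, half_window)]


def accuracy(predicted: List[bool], observed: List[bool]) -> float:
    """Fraction of residues whose predicted label matches the observed label."""
    if len(predicted) != len(observed):
        raise ValueError("predicted and observed must have the same length")
    if not observed:
        raise ValueError("no residues to compare")
    correct = sum(p == o for p, o in zip(predicted, observed))
    return correct / len(observed)


# Toy per-residue scores (invented): four well-scored residues, then four poorly scored ones.
toy_scores = [-3.0, -2.8, -3.2, -2.9, -0.5, -0.2, -0.4, -0.1]
toy_observed = [False, False, False, False, True, True, True, True]

toy_predicted = predict_disorder(toy_scores, cutoff=-1.5, half_window=1)
toy_accuracy = accuracy(toy_predicted, toy_observed)
```

Smoothing over neighboring residues is one plausible way to keep a single outlier score from flipping a label; in a real setting the window size and cutoff would have to be calibrated on a benchmark set such as the one described above.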