Improving protein order-disorder classification using charge-hydropathy plots.

Author information

Huang Fei, Oldfield Christopher J, Xue Bin, Hsu Wei-Lun, Meng Jingwei, Liu Xiaowen, Shen Li, Romero Pedro, Uversky Vladimir N, Dunker A

Publication information

BMC Bioinformatics. 2014;15 Suppl 17(Suppl 17):S4. doi: 10.1186/1471-2105-15-S17-S4. Epub 2014 Dec 16.

Abstract

BACKGROUND

The earliest whole-protein order/disorder predictor (Uversky et al., Proteins, 41: 415-427 (2000)), herein called the charge-hydropathy (C-H) plot, was originally developed using the Kyte-Doolittle (1982) hydropathy scale (Kyte & Doolittle, J. Mol. Biol., 157: 105-132 (1982)). Here the goal is to determine whether the performance of the C-H plot in separating structured and disordered proteins can be improved by using an alternative hydropathy scale.
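To make the C-H plot concrete, the sketch below places a sequence on the plot using the Kyte-Doolittle scale. It is a minimal illustration, not the authors' implementation. Several details are assumptions not stated in this abstract: hydropathy is rescaled to the range 0 to 1, net charge counts K and R as +1 and D and E as -1 (histidine ignored), and the default boundary line (slope 2.785, intercept -1.151) is the one commonly quoted for the original C-H plot. The function names `ch_coordinates` and `predict_disordered` are hypothetical.

```python
# Kyte-Doolittle (1982) hydropathy values for the 20 standard amino acids.
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def ch_coordinates(sequence):
    """Return (mean hydropathy, mean absolute net charge) for a sequence.

    Assumptions: hydropathy is rescaled from [-4.5, 4.5] to [0, 1];
    K/R count as +1, D/E as -1, histidine is treated as neutral;
    non-standard residue letters are skipped.
    """
    residues = [aa for aa in sequence.upper() if aa in KYTE_DOOLITTLE]
    if not residues:
        raise ValueError("sequence contains no standard amino acids")
    n = len(residues)
    mean_h = sum((KYTE_DOOLITTLE[aa] + 4.5) / 9.0 for aa in residues) / n
    positive = sum(aa in "KR" for aa in residues)
    negative = sum(aa in "DE" for aa in residues)
    mean_r = abs(positive - negative) / n
    return mean_h, mean_r


def predict_disordered(sequence, slope=2.785, intercept=-1.151):
    """Classify a whole protein by which side of a boundary line it falls on.

    The default line <R> = 2.785 * <H> - 1.151 is an assumption taken from
    the literature on the original C-H plot, not from this abstract; a
    different hydropathy scale would need its own fitted boundary.
    """
    mean_h, mean_r = ch_coordinates(sequence)
    return mean_r > slope * mean_h + intercept
```

Swapping in another hydropathy scale amounts to replacing the dictionary and refitting the boundary line, which is the kind of comparison the abstract describes.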

RESULTS

Using C-H plot performance as the metric, we compared 19 alternative hydropathy scales and found that the Guy (1985) hydropathy scale (Guy, Biophys. J., 47: 61-70 (1985)) was the best of the tested scales for separating large collections of structured proteins and intrinsically disordered proteins (IDPs) on the C-H plot. Next, we developed a new scale, named IDP-Hydropathy, which further improves the discrimination between structured proteins and IDPs. Applied to a dataset containing 109 IDPs and 563 non-homologous fully structured proteins, the C-H plot gave balanced two-state classification accuracies of 79%, 84%, and 90% with the Kyte-Doolittle (1982), Guy (1985), and IDP-Hydropathy scales, respectively, indicating that a substantial overall improvement is obtained by changing the hydropathy scale. A correlation study shows that IDP-Hydropathy is strongly correlated with other hydropathy scales, suggesting that IDP-Hydropathy probably has only minor contributions from amino acid properties other than hydropathy.
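Because the dataset is imbalanced (109 IDPs against 563 structured proteins), the choice of a balanced accuracy matters. The sketch below assumes the usual definition, the mean of the per-class accuracies (sensitivity and specificity); the abstract does not spell out the formula, so this is an assumption. The function names are hypothetical, and the example uses only the class sizes from the abstract with a trivial classifier, not any result from the paper.

```python
def balanced_accuracy(tp, fn, tn, fp):
    """Mean of per-class accuracies for a two-state classifier.

    tp/fn: disordered proteins classified correctly/incorrectly.
    tn/fp: structured proteins classified correctly/incorrectly.
    Assumes the standard definition (sensitivity + specificity) / 2.
    """
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    return (sensitivity + specificity) / 2.0


def plain_accuracy(tp, fn, tn, fp):
    """Fraction of all proteins classified correctly."""
    return (tp + tn) / (tp + fn + tn + fp)


# Trivial classifier that calls every protein "structured", evaluated on
# class sizes matching the abstract (109 IDPs, 563 structured proteins).
always_structured = dict(tp=0, fn=109, tn=563, fp=0)
plain = plain_accuracy(**always_structured)        # 563 / 672, about 0.84
balanced = balanced_accuracy(**always_structured)  # 0.5
```

The trivial classifier scores about 84% on plain accuracy but only 50% on balanced accuracy, which is why a balanced measure is the more informative one for comparing scales on this dataset.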

CONCLUSION

We suggest that IDP-Hydropathy would likely be the best scale to use for any type of algorithm developed to predict protein disorder.

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/4a55/4304195/cc1b45adb670/1471-2105-15-S17-S4-1.jpg
