FRTpred: A novel approach for accurate prediction of protein folding rate and type.

Affiliations

Computational Biology and Bioinformatics Laboratory, Department of Integrative Biotechnology, College of Bioengineering and Biotechnology, Sungkyunkwan University, Suwon, 16419, South Korea.

School of Computational Sciences, Korea Institute for Advanced Study, Seoul, 02455, South Korea.

Publication information

Comput Biol Med. 2022 Oct;149:105911. doi: 10.1016/j.compbiomed.2022.105911. Epub 2022 Aug 26.

Abstract

The protein folding rate is a key property for understanding the protein folding process and is helpful for designing proteins. Predicting such properties from either sequence or structural information is a challenging task in bioinformatics. Although several computational methods have been developed in the past, only one sequence-based method is publicly available, and it shows limited accuracy when evaluated using a standardized independent dataset. This study proposes a novel approach, called FRTpred, that simultaneously predicts the logarithmic protein folding rate constant, ln(k), and the folding type from the provided sequence. First, 30 baseline models (regression models for ln(k) and classification models for folding type) were constructed by integrating 10 representative feature extraction methods and three commonly used machine-learning algorithms. Subsequently, the predicted values of the 30 baseline models were combined and fed into the random forest algorithm to construct the final prediction model. Cross-validation analysis showed that FRTpred achieved mean absolute deviations of 1.491, 2.016, and 1.954 for the non-two-state, two-state, and combined models, respectively, when predicting ln(k). Moreover, FRTpred predicts the folding type with an accuracy of 0.843. Performance comparisons based on independent tests against existing methods showed that FRTpred is more precise for both ln(k) and folding type prediction. Thus, FRTpred is a powerful tool that may accelerate the characterization of foldomics protein data and further inspire the development of next-generation predictors. The proposed model is available as a web server that is freely accessible at http://thegleelab.org/FRTpred.
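
The two-stage design described in the abstract can be sketched as follows. This is a minimal illustration in Python, not the authors' implementation. The abstract gives only the counts (10 feature extraction methods, three algorithms), so the names `feature_1` ... `feature_10`, `algo_A` ... `algo_C`, and the `baseline_predict` callback are hypothetical placeholders. The metric helpers assume that "mean absolute deviation" means the average absolute difference between predicted and experimental ln(k).

```python
from itertools import product

# Hypothetical placeholder names: the abstract states only the counts
# (10 feature extraction methods, 3 algorithms), not their identities.
FEATURE_METHODS = [f"feature_{i}" for i in range(1, 11)]
ALGORITHMS = ["algo_A", "algo_B", "algo_C"]

# One baseline model per (feature method, algorithm) pair: 10 x 3 = 30.
BASELINES = list(product(FEATURE_METHODS, ALGORITHMS))


def meta_features(baseline_predict, sequence):
    """Collect one prediction per baseline into a 30-element vector.

    `baseline_predict(feature, algo, sequence)` is a caller-supplied
    function standing in for a trained baseline model (an assumption
    made for illustration only).
    """
    return [baseline_predict(f, a, sequence) for f, a in BASELINES]


def mean_absolute_deviation(y_true, y_pred):
    """Average absolute difference between true and predicted ln(k)."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def accuracy(labels_true, labels_pred):
    """Fraction of folding-type labels predicted correctly."""
    hits = sum(t == p for t, p in zip(labels_true, labels_pred))
    return hits / len(labels_true)


if __name__ == "__main__":
    # Toy stand-in baseline: returns the sequence length as a float.
    vector = meta_features(lambda f, a, s: float(len(s)), "ACDEFGHIK")
    print(len(BASELINES), len(vector))
    print(mean_absolute_deviation([1.0, 2.0, 3.0], [1.5, 2.0, 2.0]))
    print(accuracy(["two-state", "non-two-state"], ["two-state", "two-state"]))
```

In a full pipeline, the 30-element vector for each sequence would be the input to the second-stage random forest (for example, scikit-learn's `RandomForestRegressor` for ln(k) and `RandomForestClassifier` for folding type), which is omitted here to keep the sketch free of external dependencies.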
