predCar-site: Carbonylation sites prediction in proteins using support vector machine with resolving data imbalanced issue.

Authors

Hasan Md Al Mehedi, Li Jinyan, Ahmad Shamim, Molla Md Khademul Islam

Affiliations

Department of Computer Science & Engineering, University of Rajshahi, Bangladesh.

Advanced Analytics Institute and Centre for Health Technologies, University of Technology Sydney, Australia.

Publication information

Anal Biochem. 2017 May 15;525:107-113. doi: 10.1016/j.ab.2017.03.008. Epub 2017 Mar 9.

Abstract

Carbonylation is an irreversible post-translational modification and is considered a biomarker of oxidative stress. It not only plays a major role in orchestrating various biological processes but is also associated with diseases such as Alzheimer's disease, diabetes, and Parkinson's disease. Because experimental techniques for detecting carbonylation sites in proteins are costly and time-consuming, an accurate computational method for predicting these sites is urgently needed and could be useful for drug development. In this study, a novel computational tool termed predCar-Site was developed to predict protein carbonylation sites by (1) incorporating sequence-coupled information into the general pseudo amino acid composition, (2) balancing the effect of the skewed training dataset with the Different Error Costs method, and (3) constructing a predictor using a support vector machine as the classifier. The predCar-Site predictor achieves average AUC (area under the curve) scores of 0.9959, 0.9999, 1, and 0.9997 in predicting the carbonylation sites of K, P, R, and T residues, respectively. All experimental results, including the AUC values, are averages over 5 complete runs of 10-fold cross-validation, and they indicate significantly better performance than existing predictors. A user-friendly web server for predCar-Site is available at http://research.ru.ac.bd/predCar-Site/.
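The imbalance handling and the evaluation protocol named in the abstract can be illustrated with a minimal, standard-library Python sketch. It is not the authors' implementation. It assumes the common form of Different Error Costs, in which the positive-class cost is scaled by the negative-to-positive ratio, and it assumes stratified folds, which the abstract does not state. The names `dec_costs`, `auc`, `stratified_folds`, `repeated_cv_auc`, and the `fit_and_score` callback are hypothetical; the callback stands in for training a cost-weighted SVM on the training indices and returning decision scores for the test indices.

```python
import random
from statistics import mean

def dec_costs(labels, base_c=1.0):
    """Per-class error costs with C_pos / C_neg = n_neg / n_pos (assumed ratio)."""
    n_pos = sum(1 for y in labels if y == 1)
    n_neg = len(labels) - n_pos
    if n_pos == 0 or n_neg == 0:
        raise ValueError("both classes must be present")
    return {1: base_c * n_neg / n_pos, 0: base_c}

def auc(labels, scores):
    """AUC as the fraction of (positive, negative) pairs ranked correctly; ties count half."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

def stratified_folds(labels, k, rng):
    """Split indices into k test folds, dealing each class out round-robin."""
    folds = [[] for _ in range(k)]
    for cls in (0, 1):
        idx = [i for i, y in enumerate(labels) if y == cls]
        rng.shuffle(idx)
        for j, i in enumerate(idx):
            folds[j % k].append(i)
    return folds

def repeated_cv_auc(labels, fit_and_score, k=10, runs=5, seed=0):
    """Average AUC over `runs` complete k-fold runs (needs >= k samples per class)."""
    rng = random.Random(seed)
    run_means = []
    for _ in range(runs):
        fold_aucs = []
        for test_idx in stratified_folds(labels, k, rng):
            held_out = set(test_idx)
            train_idx = [i for i in range(len(labels)) if i not in held_out]
            # Hypothetical callback: train on train_idx, score test_idx.
            scores = fit_and_score(train_idx, test_idx)
            fold_aucs.append(auc([labels[i] for i in test_idx], scores))
        run_means.append(mean(fold_aucs))
    return mean(run_means)
```

In a library such as scikit-learn, costs like those returned by `dec_costs` would typically be passed to the SVM through its per-class weighting option, for example the `class_weight` parameter of `SVC`. Whether predCar-Site uses this exact ratio is not stated in the abstract.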

