Investigating the Precise Identification of Citrullination Sites with High-Performance Score Metrics Using a Powerful Computation Predicting Tool.

Authors

Ahmed Fee Faysal, Podder Anamika, Bulbul Md Farhad, Hossain Md Amzad, Hasan Mahedi, Sarkar Md Abdur Rauf, Kim Daijin

Affiliations

Department of Mathematics, Jashore University of Science and Technology, Jashore, 7408, Bangladesh.

Department of Computer Science & Engineering, Pohang University of Science and Technology (POSTECH), 77 Cheongam, Pohang 37673, Korea.

Publication information

Comb Chem High Throughput Screen. 2024;27(9):1381-1393. doi: 10.2174/1386207326666230912151932.

Abstract

BACKGROUND

To elucidate the detailed mechanisms of citrullination at the molecular level and to design drugs for major human diseases, predicting protein citrullination sites (PCSs) is essential. Identifying PCSs experimentally is time-consuming and costly. However, current PCS predictors are limited in scope. In particular, most of the commonly used predictors achieve only limited performance scores.

OBJECTIVE

This work aims to provide an improved, more sophisticated predictor of citrullination sites, built on a benchmark dataset in a machine learning platform.

METHODS

This study presents a reliable citrullination site predictor based on a benchmark dataset containing a 1:1 ratio of positive and negative samples. We classified citrullination sites using Composition of k-Spaced Amino Acid Pairs (CKSAAP) features and a Support Vector Machine (SVM) classifier.
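As an illustration of the CKSAAP encoding named above, the following is a minimal sketch and not the authors' implementation. It assumes the 20 standard amino acids, a hypothetical maximum gap `k_max`, and the common normalization in which each pair count is divided by the number of pair positions in the peptide window; the abstract does not state the window length, the gap range, or the normalization actually used.

```python
from itertools import product

# Assumption: only the 20 standard amino acids are encoded.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"
# All 400 ordered residue pairs, in a fixed order.
PAIRS = ["".join(p) for p in product(AMINO_ACIDS, repeat=2)]


def cksaap(peptide, k_max=2):
    """Return CKSAAP features for gaps k = 0..k_max (hypothetical default).

    For each gap k, count every pair (peptide[i], peptide[i + k + 1]) and
    divide by the number of positions where such a pair can occur.
    The result has 400 * (k_max + 1) values.
    """
    features = []
    for k in range(k_max + 1):
        counts = dict.fromkeys(PAIRS, 0)
        n_positions = len(peptide) - k - 1
        for i in range(max(n_positions, 0)):
            pair = peptide[i] + peptide[i + k + 1]
            if pair in counts:  # skip non-standard residues or padding
                counts[pair] += 1
        denom = n_positions if n_positions > 0 else 1
        features.extend(counts[p] / denom for p in PAIRS)
    return features


# Hypothetical 11-residue window centred on an arginine (R).
vector = cksaap("ACDEFRGHIKL", k_max=2)
print(len(vector))
```

Vectors of this kind could then be passed to an SVM classifier, for example scikit-learn's `sklearn.svm.SVC`; the kernel and hyperparameters are not given in the abstract, so any choice there would be an assumption.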

RESULTS

We developed PCS predictors using integrated machine-learning methods that produced the highest average scores. Using 10-fold cross-validation on test datasets, the True Positive Rate (TPR) was 98.34%, the True Negative Rate (TNR) was 99.44%, the accuracy was 98.89%, the Matthews Correlation Coefficient (MCC) was 98.21%, the Area Under the ROC Curve (AUC) was 0.999, and the partial Area Under the ROC Curve (pAUC) was 0.1968.
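For reference, the threshold-based metrics listed above follow from the entries of a binary confusion matrix. The sketch below uses the standard definitions of TPR, TNR, accuracy, and MCC; the counts in the example are hypothetical and are not taken from the study. MCC is returned on its native scale from -1 to 1, whereas the abstract reports it as a percentage.

```python
import math


def binary_metrics(tp, fn, tn, fp):
    """Compute TPR, TNR, accuracy and MCC from confusion-matrix counts."""
    tpr = tp / (tp + fn)  # sensitivity: positives correctly identified
    tnr = tn / (tn + fp)  # specificity: negatives correctly identified
    accuracy = (tp + tn) / (tp + fn + tn + fp)
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    # By convention, MCC is set to 0 when any marginal total is zero.
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0
    return {"TPR": tpr, "TNR": tnr, "accuracy": accuracy, "MCC": mcc}


# Hypothetical counts for one balanced test fold (100 positives, 100 negatives).
print(binary_metrics(tp=90, fn=10, tn=80, fp=20))
```

On a dataset with a 1:1 ratio of positive and negative samples, accuracy equals the mean of TPR and TNR, which is consistent with the reported 98.89%.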

CONCLUSION

In terms of overall performance, our predictor performs significantly better than current tools on the same benchmark dataset. Moreover, it showed better performance metrics on both test and training datasets. Our predictor is promising and can be used as a complementary technique for the fast and precise identification of citrullination sites.

