Investigating the Precise Identification of Citrullination Sites with High-Performance Score Metrics Using a Powerful Computation Predicting Tool.

Author information

Ahmed Fee Faysal, Podder Anamika, Bulbul Md Farhad, Hossain Md Amzad, Hasan Mahedi, Sarkar Md Abdur Rauf, Kim Daijin

Affiliations

Department of Mathematics, Jashore University of Science and Technology, Jashore, 7408, Bangladesh.

Department of Computer Science & Engineering, Pohang University of Science and Technology (POSTECH), 77 Cheongam, Pohang 37673, Korea.

Publication information

Comb Chem High Throughput Screen. 2024;27(9):1381-1393. doi: 10.2174/1386207326666230912151932.

DOI:10.2174/1386207326666230912151932

PMID:37702240

Abstract

BACKGROUND

Predicting protein citrullination sites (PCSs) is essential for elucidating the detailed mechanisms of citrullination at the molecular level and for designing drugs applicable to major human diseases. Identifying PCSs with experimental approaches is time-consuming and costly. However, current PCS predictors are limited in scope. In particular, most of the predictors commonly used for PCS prediction achieve limited performance scores.

OBJECTIVE

This work aims to provide an improved, more sophisticated predictor of citrullination sites, built on a benchmark dataset within a machine learning framework.

METHODS

This study presents a reliable citrullination site predictor based on a benchmark dataset containing a 1:1 ratio of positive and negative samples. We classified citrullination sites using the Composition of the K-Spaced Amino Acid Pairs (CKSAAP) and Support Vector Machine (SVM).
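As a rough illustration of the CKSAAP encoding named above, the following sketch turns a peptide window into a fixed-length feature vector. The function name `cksaap`, the maximum gap `k_max = 2`, the 20-letter alphabet, and the example window are assumptions made for illustration; the abstract does not state the window length, the gap range, or how padding is handled. Vectors of this kind could then be passed to an SVM classifier, which is not shown here.

```python
from itertools import product
from typing import List

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"
# All 400 ordered residue pairs, in a fixed order that defines the feature layout.
PAIRS = ["".join(p) for p in product(AMINO_ACIDS, repeat=2)]


def cksaap(window: str, k_max: int = 2) -> List[float]:
    """Return a CKSAAP feature vector for one peptide window.

    For each gap k = 0..k_max, every pair (window[i], window[i + k + 1]) is
    counted and divided by the number of pair positions, len(window) - k - 1.
    Pairs containing a non-standard symbol (for example a padding character)
    add no count. k_max = 2 is an assumed value, not taken from the paper.
    """
    window = window.upper()
    features: List[float] = []
    for k in range(k_max + 1):
        counts = dict.fromkeys(PAIRS, 0)
        n_positions = len(window) - k - 1
        for i in range(max(n_positions, 0)):
            pair = window[i] + window[i + k + 1]
            if pair in counts:
                counts[pair] += 1
        denom = n_positions if n_positions > 0 else 1
        features.extend(counts[p] / denom for p in PAIRS)
    return features


if __name__ == "__main__":
    # Hypothetical 9-residue window containing arginine (R); not from the paper's data.
    vec = cksaap("AGSRRGKLA", k_max=2)
    print(len(vec))  # 3 gap values x 400 pairs
```

Each gap value contributes one block of 400 frequencies, so the vector length grows linearly with `k_max`.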

RESULTS

We developed PCS predictors using integrated machine-learning methods that produced the highest average scores. Using 10-fold cross-validation on test datasets, the True Positive Rate (TPR) was 98.34%, the True Negative Rate (TNR) was 99.44%, the accuracy was 98.89%, the Matthews Correlation Coefficient (MCC) was 98.21%, the Area Under the ROC Curve (AUC) was 0.999, and the partial Area Under the ROC Curve (pAUC) was 0.1968.
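For readers unfamiliar with these scores, the sketch below shows how TPR, TNR, accuracy, and MCC are conventionally computed from confusion-matrix counts. The function name `binary_metrics` and the counts in the example are hypothetical and are not the paper's data. MCC is returned on its usual scale from -1 to 1, whereas the abstract reports it as a percentage.

```python
import math
from typing import Dict


def binary_metrics(tp: int, fn: int, tn: int, fp: int) -> Dict[str, float]:
    """Compute standard binary-classification scores from confusion counts.

    Assumes at least one positive and one negative sample, so the TPR and
    TNR denominators are non-zero.
    """
    tpr = tp / (tp + fn)                      # sensitivity / recall
    tnr = tn / (tn + fp)                      # specificity
    acc = (tp + tn) / (tp + fn + tn + fp)     # overall accuracy
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    # By convention MCC is set to 0 when any marginal total is zero.
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0
    return {"TPR": tpr, "TNR": tnr, "ACC": acc, "MCC": mcc}


if __name__ == "__main__":
    # Hypothetical counts for a balanced (1:1) test fold of 200 samples.
    scores = binary_metrics(tp=90, fn=10, tn=95, fp=5)
    for name, value in scores.items():
        print(f"{name}: {value:.4f}")
```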

CONCLUSION

According to overall performance, our predictor performs significantly better than current tools on the same benchmark dataset. Moreover, it showed better performance metrics on both test and training datasets. Our predictor is promising and can be used as a complementary technique for the fast and precise identification of citrullination sites.




