PredNTS: Improved and Robust Prediction of Nitrotyrosine Sites by Integrating Multiple Sequence Features.

Affiliations

Department of Bioscience and Bioinformatics, Kyushu Institute of Technology, 680-4 Kawazu, Iizuka, Fukuoka 820-8502, Japan.

WHO Collaborating Centre on eHealth, UNSW Digital Health, School of Public Health and Community Medicine, Faculty of Medicine, UNSW Sydney, Sydney, NSW 2052, Australia.

Publication information

Int J Mol Sci. 2021 Mar 8;22(5):2704. doi: 10.3390/ijms22052704.

DOI: 10.3390/ijms22052704

PMID: 33800121

Full text link: https://pmc.ncbi.nlm.nih.gov/articles/PMC7962192/

Abstract

Nitrotyrosine, which is generated by numerous reactive nitrogen species, is a type of protein post-translational modification. Identifying site-specific nitration of tyrosine is a prerequisite to understanding the molecular function of nitrated proteins. Thanks to progress in machine learning, computational prediction can play a vital role ahead of biological experimentation. Herein, we developed a computational predictor, PredNTS, by integrating multiple sequence features: K-mer, composition of k-spaced amino acid pairs (CKSAAP), AAindex, and binary encoding schemes. Important features were selected by recursive feature elimination using a random forest (RF) classifier. Finally, we linearly combined the RF probability scores generated by the individual RF models, each trained on a single encoding. The resulting PredNTS predictor achieved an area under the curve (AUC) of 0.910 in five-fold cross-validation and outperformed existing predictors on a comprehensive and independent dataset. Furthermore, we investigated several machine learning algorithms to demonstrate the superiority of the employed RF algorithm. PredNTS is a useful computational resource for the prediction of nitrotyrosine sites. The web application and the curated PredNTS datasets are publicly available.
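The encoding and score-combination steps named in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not give the window length, the range of gaps k, or the combination weights, so the function names, the `k_max` default, and the normalisation by the number of window positions are all assumptions made for illustration. The sketch covers only CKSAAP, binary encoding, and the linear combination of per-encoding scores; K-mer, AAindex, feature selection, and RF training are omitted.

```python
from itertools import product

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"
# All 400 ordered amino acid pairs, in a fixed order that defines the feature layout.
PAIRS = ["".join(p) for p in product(AMINO_ACIDS, repeat=2)]


def cksaap(window, k_max=2):
    """Composition of k-spaced amino acid pairs for gaps k = 0..k_max.

    Returns a flat list of 400 * (k_max + 1) frequencies. Pairs containing a
    non-standard character (for example a padding symbol) are not counted.
    k_max is a hypothetical default, not a value taken from the paper.
    """
    features = []
    for k in range(k_max + 1):
        counts = dict.fromkeys(PAIRS, 0)
        n_positions = len(window) - k - 1  # pairs separated by exactly k residues
        for i in range(n_positions):
            pair = window[i] + window[i + k + 1]
            if pair in counts:
                counts[pair] += 1
        total = max(n_positions, 1)  # guard against very short windows
        features.extend(counts[p] / total for p in PAIRS)
    return features


def binary_encoding(window):
    """One-hot encode each residue as a 20-dimensional 0/1 vector."""
    features = []
    for residue in window:
        features.extend(1 if residue == aa else 0 for aa in AMINO_ACIDS)
    return features


def combine_scores(scores, weights):
    """Weighted linear combination of per-encoding probability scores."""
    if len(scores) != len(weights):
        raise ValueError("scores and weights must have the same length")
    return sum(w * s for w, s in zip(weights, scores))
```

For a sequence window centred on a candidate tyrosine, `cksaap(window)` and `binary_encoding(window)` would each feed a separate classifier, and `combine_scores` would merge the resulting probabilities; the weights would have to be tuned on training data.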

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/4f5b/7962192/218bb548ab85/ijms-22-02704-g001.jpg
