A hybrid feature extraction scheme for efficient malonylation site prediction.

Author information

Department of Computer Engineering, University of Science and Technology of Mazandaran, Behshahr, Iran.

Department of Computer Engineering, Faculty of Information Technology, Kermanshah University of Technology, Kermanshah, Iran.

Publication information

Sci Rep. 2022 Apr 6;12(1):5756. doi: 10.1038/s41598-022-08555-9.

DOI:10.1038/s41598-022-08555-9

PMID:35388017

Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC8987080/

Abstract

Lysine malonylation is one of the most important post-translational modifications (PTMs) and affects cell function. Predicting malonylation sites in proteins can help uncover the mechanisms behind cellular functions. Experimental methods are one of the established approaches to identifying these sites, but they are typically costly and time-consuming. Recently, machine-learning methods have been proposed to tackle this problem, and they have been shown to reduce cost and time while increasing accuracy. However, these approaches have their own shortcomings, including inappropriate feature extraction from protein sequences, high-dimensional features, and inefficient underlying classifiers. This paper proposes a machine-learning method to address these problems. In the proposed approach, seven different features are extracted. The extracted features are then combined and ranked by Fisher's score (F-score), and the most efficient ones are selected. Malonylation sites are then predicted using various classifiers. Simulation results show that the proposed method performs acceptably compared with some state-of-the-art approaches. In addition, the XGBoost classifier, built on extracted features such as TFCRF, has a higher prediction rate than the other methods. The code is publicly available at: https://github.com/jimy2020/Malonylation-site-prediction.
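The ranking-and-selection step described in the abstract can be illustrated with a minimal sketch. It assumes a binary labelling (1 for malonylated sites, 0 otherwise) and uses one common definition of the F-score (between-class separation of the feature means divided by the sum of the within-class sample variances); the abstract does not state the exact formula the authors use, so this definition is an assumption. The function names `f_score`, `rank_features`, and `select_top_k`, and the toy data, are hypothetical and are not taken from the authors' repository.

```python
from statistics import mean, variance


def f_score(pos, neg):
    """F-score of one feature, given its values in the positive and negative class.

    Assumed definition: squared distances of the class means from the overall
    mean, divided by the sum of the within-class sample variances.
    Each class needs at least two samples for the sample variance.
    """
    overall = mean(pos + neg)
    between = (mean(pos) - overall) ** 2 + (mean(neg) - overall) ** 2
    within = variance(pos) + variance(neg)
    if within == 0:
        # Guard for features that are constant within both classes.
        return float("inf") if between > 0 else 0.0
    return between / within


def rank_features(X, y):
    """Return (feature indices sorted by decreasing F-score, per-feature scores)."""
    n_features = len(X[0])
    scores = []
    for j in range(n_features):
        pos = [row[j] for row, label in zip(X, y) if label == 1]
        neg = [row[j] for row, label in zip(X, y) if label == 0]
        scores.append(f_score(pos, neg))
    order = sorted(range(n_features), key=lambda j: scores[j], reverse=True)
    return order, scores


def select_top_k(X, order, k):
    """Keep only the k highest-ranked feature columns of X."""
    keep = order[:k]
    return [[row[j] for j in keep] for row in X]


# Toy data (hypothetical): 4 samples, 3 combined features.
# Feature 0 separates the classes best; feature 2 does not separate them at all.
X = [
    [0.9, 0.2, 5.0],
    [0.8, 0.1, 4.0],
    [0.1, 0.3, 5.0],
    [0.2, 0.2, 4.0],
]
y = [1, 1, 0, 0]

order, scores = rank_features(X, y)
X_selected = select_top_k(X, order, k=2)
print(order, X_selected)
```

In this sketch the reduced matrix `X_selected` would then be passed to a classifier; the abstract reports XGBoost as the best-performing choice, but any classifier could be substituted at that point.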

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/2f5b/8987080/5e0d14932b13/41598_2022_8555_Fig1_HTML.jpg

Similar articles

A hybrid feature extraction scheme for efficient malonylation site prediction.

Sci Rep. 2022 Apr 6;12(1):5756. doi: 10.1038/s41598-022-08555-9.

Analysis and review of techniques and tools based on machine learning and deep learning for prediction of lysine malonylation sites in protein sequences.

Database (Oxford). 2024 Jan 19;2024. doi: 10.1093/database/baad094.

Mal-Prec: computational prediction of protein Malonylation sites via machine learning based feature integration: Malonylation site prediction.

BMC Genomics. 2020 Nov 23;21(1):812. doi: 10.1186/s12864-020-07166-w.

SEMal: Accurate protein malonylation site predictor using structural and evolutionary information.

Comput Biol Med. 2020 Oct;125:104022. doi: 10.1016/j.compbiomed.2020.104022. Epub 2020 Sep 29.

Computational analysis and prediction of lysine malonylation sites by exploiting informative features in an integrative machine-learning framework.

Brief Bioinform. 2019 Nov 27;20(6):2185-2199. doi: 10.1093/bib/bby079.

Integration of A Deep Learning Classifier with A Random Forest Approach for Predicting Malonylation Sites.

Genomics Proteomics Bioinformatics. 2018 Dec;16(6):451-459. doi: 10.1016/j.gpb.2018.08.004. Epub 2019 Jan 11.

Incorporating hybrid models into lysine malonylation sites prediction on mammalian and plant proteins.

Sci Rep. 2020 Jun 29;10(1):10541. doi: 10.1038/s41598-020-67384-w.

Prediction of Lysine Malonylation Sites Based on Pseudo Amino Acid.

Comb Chem High Throughput Screen. 2017;20(7):622-628. doi: 10.2174/1386207320666170314102647.

Mal-Light: Enhancing Lysine Malonylation Sites Prediction Problem Using Evolutionary-based Features.

IEEE Access. 2020;8:77888-77902. doi: 10.1109/access.2020.2989713. Epub 2020 Apr 22.

Computational Method for Identifying Malonylation Sites by Using Random Forest Algorithm.

Comb Chem High Throughput Screen. 2020;23(4):304-312. doi: 10.2174/1386207322666181227144318.

Cited by

Radiomics of Dynamic Contrast-Enhanced MRI for Predicting Radiation-Induced Hepatic Toxicity After Intensity Modulated Radiotherapy for Hepatocellular Carcinoma: A Machine Learning Predictive Model Based on the SHAP Methodology.

J Hepatocell Carcinoma. 2025 May 17;12:999-1015. doi: 10.2147/JHC.S523448. eCollection 2025.

A Computational Predictor for Accurate Identification of Tumor Homing Peptides by Integrating Sequential and Deep BiLSTM Features.

Interdiscip Sci. 2024 Jun;16(2):503-518. doi: 10.1007/s12539-024-00628-9. Epub 2024 May 11.

EACVP: An ESM-2 LM Framework Combined CNN and CBAM Attention to Predict Anti-coronavirus Peptides.

Curr Med Chem. 2025;32(10):2040-2054. doi: 10.2174/0109298673287899240303164403.

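Analysis and review of techniques and tools based on machine learning and deep learning for prediction of lysine malonylation sites in protein sequences.

Database (Oxford). 2024 Jan 19;2024. doi: 10.1093/database/baad094.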

References

DeepSADPr: A hybrid-learning architecture for serine ADP-ribosylation site prediction.

Methods. 2022 Jul;203:575-583. doi: 10.1016/j.ymeth.2021.09.008. Epub 2021 Sep 21.

FSL-Kla: A few-shot learning-based multi-feature hybrid system for lactylation site prediction.

Comput Struct Biotechnol J. 2021 Aug 10;19:4497-4509. doi: 10.1016/j.csbj.2021.08.013. eCollection 2021.

Predicting phosphorylation sites using machine learning by integrating the sequence, structure, and functional information of proteins.

J Transl Med. 2021 May 24;19(1):218. doi: 10.1186/s12967-021-02851-0.

A Transfer Learning-Based Approach for Lysine Propionylation Prediction.

Front Physiol. 2021 Apr 21;12:658633. doi: 10.3389/fphys.2021.658633. eCollection 2021.

Prediction and analysis of multiple protein lysine modified sites based on conditional Wasserstein generative adversarial networks.

BMC Bioinformatics. 2021 Mar 31;22(1):171. doi: 10.1186/s12859-021-04101-y.

Mal-Light: Enhancing Lysine Malonylation Sites Prediction Problem Using Evolutionary-based Features.

IEEE Access. 2020;8:77888-77902. doi: 10.1109/access.2020.2989713. Epub 2020 Apr 22.

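Mal-Prec: computational prediction of protein Malonylation sites via machine learning based feature integration: Malonylation site prediction.

BMC Genomics. 2020 Nov 23;21(1):812. doi: 10.1186/s12864-020-07166-w.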

DeepPPSite: A deep learning-based model for analysis and prediction of phosphorylation sites using efficient sequence information.

Anal Biochem. 2021 Jan 1;612:113955. doi: 10.1016/j.ab.2020.113955. Epub 2020 Sep 16.

Incorporating hybrid models into lysine malonylation sites prediction on mammalian and plant proteins.

Sci Rep. 2020 Jun 29;10(1):10541. doi: 10.1038/s41598-020-67384-w.

RF-MaloSite and DL-Malosite: Methods based on random forest and deep learning to identify malonylation sites.

Comput Struct Biotechnol J. 2020 Mar 4;18:852-860. doi: 10.1016/j.csbj.2020.02.012. eCollection 2020.
