StackedEnC-AOP: prediction of antioxidant proteins using transform evolutionary and sequential features based multi-scale vector with stacked ensemble learning.

Affiliations

Department of Zoology, Abdul Wali Khan University Mardan, Mardan, 23200, KP, Pakistan.

Institute of Fundamental and Frontier Sciences, University of Electronic Science and Technology of China, Chengdu, 610054, People's Republic of China.

Publication information

BMC Bioinformatics. 2024 Aug 4;25(1):256. doi: 10.1186/s12859-024-05884-6.

DOI:10.1186/s12859-024-05884-6
PMID:39098908
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC11298090/
Abstract

BACKGROUND

Antioxidant proteins are involved in several biological processes and can protect DNA and cells from free-radical damage. These proteins regulate the body's oxidative stress and play a significant role in many antioxidant-based drugs. The current in vitro-based medications are costly, time-consuming, and unable to efficiently screen and identify the targeted motifs of antioxidant proteins.

METHODS

We propose StackedEnC-AOP, an accurate prediction method for discriminating antioxidant proteins. The training sequences are encoded by incorporating a discrete wavelet transform (DWT) into the evolutionary matrix: the PSSM-based images are decomposed via two levels of DWT to form a pseudo position-specific scoring matrix (PsePSSM-DWT) based embedded vector. Additionally, the evolutionary difference formula and composite physicochemical properties methods are employed to collect structural and sequential descriptors. The sequential features, evolutionary descriptors, and physicochemical properties are then combined into a single vector to cover the flaws of the individual encoding schemes. To reduce the computational cost of the combined feature vector, the optimal features are chosen using minimum redundancy and maximum relevance (mRMR). A stacking-based ensemble meta-model is then trained on the optimal feature vector.
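The two-level DWT decomposition of a PSSM described above can be sketched as follows. This is a minimal illustration, not the authors' implementation: the abstract does not name the wavelet family or say how the coefficients are turned into a fixed-length vector, so the Haar wavelet, the padding rule, the four summary statistics, and the function names `haar_dwt` and `dwt_features` are all assumptions made for this example.

```python
import math
from statistics import mean, pstdev


def haar_dwt(signal):
    """One level of the Haar DWT: returns (approximation, detail) coefficients."""
    if len(signal) % 2:  # assumption: pad odd-length input by repeating the last value
        signal = signal + [signal[-1]]
    s = math.sqrt(2.0)
    approx = [(signal[i] + signal[i + 1]) / s for i in range(0, len(signal), 2)]
    detail = [(signal[i] - signal[i + 1]) / s for i in range(0, len(signal), 2)]
    return approx, detail


def dwt_features(pssm, levels=2):
    """Fixed-length vector: each PSSM column is decomposed and each sub-band summarised."""
    features = []
    for j in range(len(pssm[0])):
        band = [float(row[j]) for row in pssm]  # one column = one signal along the sequence
        sub_bands = []
        for _ in range(levels):
            band, detail = haar_dwt(band)  # keep the detail, decompose the approximation again
            sub_bands.append(detail)
        sub_bands.append(band)  # final approximation band
        for coeffs in sub_bands:
            # assumption: mean, std, max, min as the per-band summary
            features.extend([mean(coeffs), pstdev(coeffs), max(coeffs), min(coeffs)])
    return features


# Toy PSSM with 6 residues and 3 columns (a real PSSM has 20 columns).
toy_pssm = [
    [2, -1, 0],
    [1, 0, -2],
    [3, -3, 1],
    [0, 2, -1],
    [-1, 1, 4],
    [2, 0, 0],
]
vector = dwt_features(toy_pssm)
print(len(vector))  # 3 columns x 3 sub-bands x 4 statistics = 36
```

Because each sub-band is reduced to a fixed number of statistics, proteins of different lengths map to vectors of the same size, which is what a downstream classifier needs. With a 20-column PSSM and two levels, this particular summary would give 240 values per protein.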

RESULTS

StackedEnC-AOP reported a prediction accuracy of 98.40% and an AUC of 0.99 on the training sequences. For validation, the trained StackedEnC-AOP model achieved an accuracy of 96.92% and an AUC of 0.98 on an independent set.
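For readers unfamiliar with the two reported metrics, the sketch below shows how accuracy and AUC are computed from a classifier's outputs. The labels and scores are made-up toy values, not data from the study, and the 0.5 decision threshold and the function names `accuracy` and `roc_auc` are assumptions for illustration.

```python
def accuracy(y_true, y_pred):
    """Fraction of samples whose predicted label matches the true label."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def roc_auc(y_true, scores):
    """AUC as the probability that a random positive outscores a random negative.

    Ties count as half a win (the Mann-Whitney formulation of the ROC area).
    """
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Toy example: 1 = antioxidant protein, 0 = non-antioxidant protein.
y_true = [1, 1, 1, 0, 0, 0]
scores = [0.9, 0.8, 0.4, 0.6, 0.3, 0.1]       # hypothetical predicted probabilities
y_pred = [1 if s >= 0.5 else 0 for s in scores]  # assumed 0.5 decision threshold

print(round(accuracy(y_true, y_pred), 3))  # 4 of 6 correct -> 0.667
print(round(roc_auc(y_true, scores), 3))   # 8 of 9 positive/negative pairs ranked correctly -> 0.889
```

Accuracy depends on the chosen threshold, while AUC depends only on how the scores rank positives against negatives, which is why the two are usually reported together.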

CONCLUSION

StackedEnC-AOP performed significantly better than current computational models, with ~5% and ~3% higher accuracy on the training and independent sets, respectively. Its efficacy and consistency make it a valuable tool for data scientists, and it can play a key role in academic research and drug design.

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/ca38/11298090/1c83a3aaf923/12859_2024_5884_Fig1_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/ca38/11298090/0e5a991f6564/12859_2024_5884_Fig2_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/ca38/11298090/6aa46d497be8/12859_2024_5884_Fig3_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/ca38/11298090/7b1b62066517/12859_2024_5884_Fig4_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/ca38/11298090/b193cdc76f8c/12859_2024_5884_Fig5_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/ca38/11298090/b02c50366029/12859_2024_5884_Fig6_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/ca38/11298090/b6c7188f6c9c/12859_2024_5884_Fig7_HTML.jpg
