AFP-CMBPred: Computational identification of antifreeze proteins by extending consensus sequences into multi-blocks evolutionary information.

Affiliations

School of Computer Science and Engineering, Nanjing University of Science and Technology, Nanjing, China.

Department of Computer Science, Abdul Wali Khan University Mardan, Pakistan.

Publication information

Comput Biol Med. 2021 Dec;139:105006. doi: 10.1016/j.compbiomed.2021.105006. Epub 2021 Nov 2.

DOI:10.1016/j.compbiomed.2021.105006

PMID:34749096

Abstract

In extremely cold environments, living organisms such as plants, animals, fish, and microbes can die from intracellular ice formation. To survive in such environments, some cold-blooded species produce antifreeze proteins (AFPs), also called ice-binding proteins. AFPs are relevant not only to medicine but also to biotechnology, agriculture, and the food industry. Different AFPs are highly heterogeneous in structure and sequence. Given their importance, several machine-learning-based models have been developed to predict AFPs. However, because AFPs are complex and diverse, the prediction performance of existing methods is limited, and a reliable computational model that predicts AFPs accurately is still needed. This study presents a novel AFP predictor named AFP-CMBPred. AFP sequences are encoded with four feature representation methods: Amphiphilic pseudo amino acid composition (Amp-PseAAC), Dipeptide Deviation from Expected Mean (DDE), Multi-Blocks Position Specific Scoring Matrix (MB-PSSM), and Consensus Sequence based on Multi-Blocks Position Specific Scoring Matrix (CS-MB-PSSM), which together capture local and global descriptors. The extracted feature vectors are then evaluated with Support Vector Machine (SVM) and Random Forest (RF) classifiers. The prediction performance of both classifiers is assessed with three validation methods: the jackknife test, 10-fold cross-validation, and an independent test. Across these tests, our proposed model achieved prediction accuracies about 2.65%, 2.84%, and 3.37% higher than existing models on the jackknife, 10-fold, and independent tests, respectively. These results indicate that AFP-CMBPred outperforms existing models for the identification of AFPs. We anticipate that AFP-CMBPred will be a valuable tool for academic research and drug development.
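To make one of the four descriptors named in the abstract concrete, the following is a minimal sketch of DDE as it is commonly defined in the literature: the observed frequency of each dipeptide is compared with a theoretical mean derived from codon counts and scaled by the theoretical standard deviation. The abstract does not state the formula, so the definition used here, the function name `dde_features`, and the handling of non-standard residues are assumptions for illustration, not the authors' implementation.

```python
import math
from itertools import product

# Number of codons encoding each amino acid in the standard genetic code
# (61 sense codons in total; stop codons are excluded).
CODONS = {
    "A": 4, "R": 6, "N": 2, "D": 2, "C": 2, "Q": 2, "E": 2, "G": 4,
    "H": 2, "I": 3, "L": 6, "K": 2, "M": 1, "F": 2, "P": 4, "S": 6,
    "T": 4, "W": 1, "Y": 2, "V": 4,
}
AMINO_ACIDS = sorted(CODONS)          # fixed, alphabetical feature order
TOTAL_CODONS = sum(CODONS.values())   # 61


def dde_features(sequence):
    """Return a 400-dimensional DDE vector for a protein sequence.

    Assumption: dipeptides containing a non-standard residue (e.g. 'X')
    are skipped, and the number of remaining dipeptides is used as the
    normalising count.
    """
    seq = sequence.upper()
    pairs = [
        (a, b) for a, b in zip(seq, seq[1:])
        if a in CODONS and b in CODONS
    ]
    n_pairs = len(pairs)
    if n_pairs == 0:
        raise ValueError("sequence contains no valid dipeptide")

    # Count each observed dipeptide.
    counts = {}
    for pair in pairs:
        counts[pair] = counts.get(pair, 0) + 1

    features = []
    for a, b in product(AMINO_ACIDS, repeat=2):
        dc = counts.get((a, b), 0) / n_pairs                  # observed frequency
        tm = (CODONS[a] / TOTAL_CODONS) * (CODONS[b] / TOTAL_CODONS)  # theoretical mean
        tv = tm * (1.0 - tm) / n_pairs                        # theoretical variance
        features.append((dc - tm) / math.sqrt(tv))
    return features


if __name__ == "__main__":
    vector = dde_features("MKTAYIAKQRQISFVKSHFSRQ")  # arbitrary example sequence
    print(len(vector))
```

The resulting vector has a fixed length of 400 regardless of sequence length, which is what lets it be concatenated with other descriptors and passed to a classifier such as an SVM or RF.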

Similar articles

AFP-SPTS: An Accurate Prediction of Antifreeze Proteins Using Sequential and Pseudo-Tri-Slicing Evolutionary Features with an Extremely Randomized Tree.

J Chem Inf Model. 2023 Feb 13;63(3):826-834. doi: 10.1021/acs.jcim.2c01417. Epub 2023 Jan 17.

Prediction of antifreeze proteins using machine learning.

Sci Rep. 2022 Nov 30;12(1):20672. doi: 10.1038/s41598-022-24501-1.

Using support vector machine and evolutionary profiles to predict antifreeze protein sequences.

Int J Mol Sci. 2012;13(2):2196-2207. doi: 10.3390/ijms13022196. Epub 2012 Feb 17.

iAFP-Ense: An Ensemble Classifier for Identifying Antifreeze Protein by Incorporating Grey Model and PSSM into PseAAC.

J Membr Biol. 2016 Dec;249(6):845-854. doi: 10.1007/s00232-016-9935-9. Epub 2016 Nov 3.

TargetFreeze: Identifying Antifreeze Proteins via a Combination of Weights using Sequence Evolutionary Information and Pseudo Amino Acid Composition.

J Membr Biol. 2015 Dec;248(6):1005-14. doi: 10.1007/s00232-015-9811-z. Epub 2015 Jun 10.

AFP-Pred: A random forest approach for predicting antifreeze proteins from sequence-derived properties.

J Theor Biol. 2011 Feb 7;270(1):56-62. doi: 10.1016/j.jtbi.2010.10.037. Epub 2010 Nov 4.

AFP-LSE: Antifreeze Proteins Prediction Using Latent Space Encoding of Composition of k-Spaced Amino Acid Pairs.

Sci Rep. 2020 Apr 28;10(1):7197. doi: 10.1038/s41598-020-63259-2.

An Effective Antifreeze Protein Predictor with Ensemble Classifiers and Comprehensive Sequence Descriptors.

Int J Mol Sci. 2015 Sep 7;16(9):21191-214. doi: 10.3390/ijms160921191.

RAFP-Pred: Robust Prediction of Antifreeze Proteins Using Localized Analysis of n-Peptide Compositions.

IEEE/ACM Trans Comput Biol Bioinform. 2018 Jan-Feb;15(1):244-250. doi: 10.1109/TCBB.2016.2617337. Epub 2016 Oct 13.

Cited by

BERT-DomainAFP: Antifreeze protein recognition and classification model based on BERT and structural domain annotation.

iScience. 2025 Mar 6;28(4):112077. doi: 10.1016/j.isci.2025.112077. eCollection 2025 Apr 18.

Classification of pulmonary diseases from chest radiographs using deep transfer learning.

PLoS One. 2025 Mar 17;20(3):e0316929. doi: 10.1371/journal.pone.0316929. eCollection 2025.

Leveraging deep learning for epigenetic protein prediction: a novel approach for early lung cancer diagnosis and drug discovery.

Health Inf Sci Syst. 2025 Mar 11;13(1):28. doi: 10.1007/s13755-025-00347-5. eCollection 2025 Dec.

XGBoost-enhanced ensemble model using discriminative hybrid features for the prediction of sumoylation sites.

BioData Min. 2025 Feb 3;18(1):12. doi: 10.1186/s13040-024-00415-8.

pACP-HybDeep: predicting anticancer peptides using binary tree growth based transformer and structural feature encoding with deep-hybrid learning.

Sci Rep. 2025 Jan 2;15(1):565. doi: 10.1038/s41598-024-84146-0.

Deep-GB: A novel deep learning model for globular protein prediction using CNN-BiLSTM architecture and enhanced PSSM with trisection strategy.

IET Syst Biol. 2024 Dec;18(6):208-217. doi: 10.1049/syb2.12108. Epub 2024 Nov 8.

AI based predictive acceptability model for effective vaccine delivery in healthcare systems.

Sci Rep. 2024 Nov 4;14(1):26657. doi: 10.1038/s41598-024-76891-z.

StackedEnC-AOP: prediction of antioxidant proteins using transform evolutionary and sequential features based multi-scale vector with stacked ensemble learning.

BMC Bioinformatics. 2024 Aug 4;25(1):256. doi: 10.1186/s12859-024-05884-6.

ENCAP: Computational prediction of tumor T cell antigens with ensemble classifiers and diverse sequence features.

PLoS One. 2024 Jul 18;19(7):e0307176. doi: 10.1371/journal.pone.0307176. eCollection 2024.

Meta-2OM: A multi-classifier meta-model for the accurate prediction of RNA 2'-O-methylation sites in human RNA.

PLoS One. 2024 Jun 26;19(6):e0305406. doi: 10.1371/journal.pone.0305406. eCollection 2024.
