AFP-Pred: A random forest approach for predicting antifreeze proteins from sequence-derived properties.

Affiliation

Institute for Neuro- and Bioinformatics, University of Lübeck, 23538 Lübeck, Germany.

Publication information

J Theor Biol. 2011 Feb 7;270(1):56-62. doi: 10.1016/j.jtbi.2010.10.037. Epub 2010 Nov 4.

DOI:10.1016/j.jtbi.2010.10.037

Abstract

Some organisms living at extremely low temperatures produce special proteins called "antifreeze proteins" (AFPs), which prevent cell and body fluids from freezing. AFPs are present in vertebrates, invertebrates, plants, bacteria, and fungi, among others. Although AFPs share a common function, they show a high degree of diversity in sequence and structure. Therefore, search methods based on sequence similarity often fail to identify AFPs in sequence databases. In this work, we report a random forest approach, "AFP-Pred", for predicting antifreeze proteins from protein sequence. AFP-Pred was trained on a dataset containing 300 AFPs and 300 non-AFPs and tested on a dataset containing 181 AFPs and 9193 non-AFPs. AFP-Pred achieved 81.33% accuracy in training and 83.38% in testing. The performance of AFP-Pred was compared with BLAST and HMM. The high prediction accuracy and the successful prediction of hypothetical proteins suggest that AFP-Pred can be a useful approach for identifying antifreeze proteins from sequence information, irrespective of their sequence similarity.
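
The abstract does not list the exact sequence-derived properties used by AFP-Pred. As a minimal sketch of what one such property could look like, the snippet below computes amino acid composition, i.e. the fraction of each of the 20 standard residues in a sequence. The function name `amino_acid_composition` and the toy sequence are hypothetical illustrations, not taken from the paper or its dataset.

```python
from collections import Counter

# The 20 standard amino acids in one-letter code.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def amino_acid_composition(sequence):
    """Return a 20-element list with the fraction of each standard residue.

    Assumption: non-standard symbols (e.g. 'X') are ignored, not counted.
    """
    seq = sequence.upper()
    counts = Counter(ch for ch in seq if ch in AMINO_ACIDS)
    total = sum(counts.values())
    if total == 0:
        raise ValueError("sequence contains no standard amino acids")
    return [counts[aa] / total for aa in AMINO_ACIDS]


# Hypothetical toy sequence, used only to show the shape of the output.
toy_sequence = "MDAPAKAAAKTAADAAAAAKAAAMS"
features = amino_acid_composition(toy_sequence)
```

A fixed-length vector like `features` is the kind of input a random forest classifier (for example scikit-learn's `RandomForestClassifier`) could be trained on. Whether AFP-Pred used this particular feature is an assumption here.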

Similar articles

RAFP-Pred: Robust Prediction of Antifreeze Proteins Using Localized Analysis of n-Peptide Compositions.

IEEE/ACM Trans Comput Biol Bioinform. 2018 Jan-Feb;15(1):244-250. doi: 10.1109/TCBB.2016.2617337. Epub 2016 Oct 13.

AFP-CMBPred: Computational identification of antifreeze proteins by extending consensus sequences into multi-blocks evolutionary information.

Comput Biol Med. 2021 Dec;139:105006. doi: 10.1016/j.compbiomed.2021.105006. Epub 2021 Nov 2.

Chou's pseudo amino acid composition improves sequence-based antifreeze protein prediction.

J Theor Biol. 2014 Sep 7;356:30-5. doi: 10.1016/j.jtbi.2014.04.006. Epub 2014 Apr 13.

TargetFreeze: Identifying Antifreeze Proteins via a Combination of Weights using Sequence Evolutionary Information and Pseudo Amino Acid Composition.

J Membr Biol. 2015 Dec;248(6):1005-14. doi: 10.1007/s00232-015-9811-z. Epub 2015 Jun 10.

iAFP-Ense: An Ensemble Classifier for Identifying Antifreeze Protein by Incorporating Grey Model and PSSM into PseAAC.

J Membr Biol. 2016 Dec;249(6):845-854. doi: 10.1007/s00232-016-9935-9. Epub 2016 Nov 3.

Using support vector machine and evolutionary profiles to predict antifreeze protein sequences.

Int J Mol Sci. 2012;13(2):2196-2207. doi: 10.3390/ijms13022196. Epub 2012 Feb 17.

An Effective Antifreeze Protein Predictor with Ensemble Classifiers and Comprehensive Sequence Descriptors.

Int J Mol Sci. 2015 Sep 7;16(9):21191-214. doi: 10.3390/ijms160921191.

AFP-LSE: Antifreeze Proteins Prediction Using Latent Space Encoding of Composition of k-Spaced Amino Acid Pairs.

Sci Rep. 2020 Apr 28;10(1):7197. doi: 10.1038/s41598-020-63259-2.

Structure and application of antifreeze proteins from Antarctic bacteria.

Microb Cell Fact. 2017 Aug 7;16(1):138. doi: 10.1186/s12934-017-0737-2.

Cited by

SaGP: identifying plant saline-alkali tolerance genes based on machine learning techniques.

Front Plant Sci. 2025 Jul 16;16:1629794. doi: 10.3389/fpls.2025.1629794. eCollection 2025.

BERT-DomainAFP: Antifreeze protein recognition and classification model based on BERT and structural domain annotation.

iScience. 2025 Mar 6;28(4):112077. doi: 10.1016/j.isci.2025.112077. eCollection 2025 Apr 18.

A review of artificial intelligence-assisted omics techniques in plant defense: current trends and future directions.

Front Plant Sci. 2024 Mar 5;15:1292054. doi: 10.3389/fpls.2024.1292054. eCollection 2024.

Analysis of Ice-Binding Protein Evolution.

Methods Mol Biol. 2024;2730:219-229. doi: 10.1007/978-1-0716-3503-2_16.

Ensemble Learning for Hormone Binding Protein Prediction: A Promising Approach for Early Diagnosis of Thyroid Hormone Disorders in Serum.

Diagnostics (Basel). 2023 Jun 1;13(11):1940. doi: 10.3390/diagnostics13111940.

Psychrophilic Yeasts: Insights into Their Adaptability to Extremely Cold Environments.

Genes (Basel). 2023 Jan 6;14(1):158. doi: 10.3390/genes14010158.

Sourcing thermotolerant poly(ethylene terephthalate) hydrolase scaffolds from natural diversity.

Nat Commun. 2022 Dec 21;13(1):7850. doi: 10.1038/s41467-022-35237-x.

Prediction of antifreeze proteins using machine learning.

Sci Rep. 2022 Nov 30;12(1):20672. doi: 10.1038/s41598-022-24501-1.

Cold adaptation strategies in plants-An emerging role of epigenetics and antifreeze proteins to engineer cold resilient plants.

Front Genet. 2022 Aug 25;13:909007. doi: 10.3389/fgene.2022.909007. eCollection 2022.

Screening gene signatures for clinical response subtypes of lung transplantation.

Mol Genet Genomics. 2022 Sep;297(5):1301-1313. doi: 10.1007/s00438-022-01918-x. Epub 2022 Jul 3.
