A review of protein function prediction under machine learning perspective.

Authors

Bernardes Juliana S, Pedreira Carlos E

Affiliation

Federal University of Rio de Janeiro UFRJ, COPPE-Engineering Graduate Program.

Publication information

Recent Pat Biotechnol. 2013 Aug;7(2):122-41. doi: 10.2174/18722083113079990006.

DOI: 10.2174/18722083113079990006
PMID: 23848274

Abstract

Protein function prediction is one of the most challenging problems in the post-genomic era. The number of newly identified proteins has been increasing exponentially with advances in high-throughput techniques. However, the functional characterization of these new proteins has not increased at the same rate. To fill this gap, a large number of computational methods have been proposed in the literature. Early approaches explored homology relationships to associate known functions with newly discovered proteins. Nevertheless, these approaches tend to fail when a new protein is considerably different (divergent) from previously known ones. Accordingly, more accurate approaches that use expressive data representations and exploit sophisticated computational techniques are required. With these points in mind, this review provides a comprehensible description of the machine learning approaches currently applied to protein function prediction problems. We start by defining several problems involved in understanding aspects of protein function and describing how machine learning can be applied to them. We aim to expose, in a systematic framework, the role of these techniques in protein function inference, which is sometimes difficult to follow because of the rapid evolution of the field. To this end, we highlight the most representative contributions and recent advances, and provide an insightful categorization and classification of machine learning methods in functional proteomics.

Similar articles

1. A review of protein function prediction under machine learning perspective.
   Recent Pat Biotechnol. 2013 Aug;7(2):122-41. doi: 10.2174/18722083113079990006.
2. Protein function prediction with high-throughput data.
   Amino Acids. 2008 Oct;35(3):517-30. doi: 10.1007/s00726-008-0077-y. Epub 2008 Apr 22.
3. Protein fold recognition using the gradient boost algorithm.
   Comput Syst Bioinformatics Conf. 2006:43-53.
4. Global sequence properties for superfamily prediction: a machine learning approach.
   J Integr Bioinform. 2009 Aug 23;6(1):109. doi: 10.2390/biecoll-jib-2009-109.
5. Probabilistic models and machine learning in structural bioinformatics.
   Stat Methods Med Res. 2009 Oct;18(5):505-26. doi: 10.1177/0962280208099492. Epub 2009 Jan 19.
6. A Review for Artificial Intelligence Based Protein Subcellular Localization.
   Biomolecules. 2024 Mar 27;14(4):409. doi: 10.3390/biom14040409.
7. Predicting protein function by machine learning on amino acid sequences--a critical evaluation.
   BMC Genomics. 2007 Mar 20;8:78. doi: 10.1186/1471-2164-8-78.
8. SVM-Fold: a tool for discriminative multi-class protein fold and superfamily recognition.
   BMC Bioinformatics. 2007 May 22;8 Suppl 4(Suppl 4):S2. doi: 10.1186/1471-2105-8-S4-S2.
9. Artificial intelligence and machine learning for protein toxicity prediction using proteomics data.
   Chem Biol Drug Des. 2020 Sep;96(3):902-920. doi: 10.1111/cbdd.13701.
10. SVM-HUSTLE--an iterative semi-supervised machine learning approach for pairwise protein remote homology detection.
    Bioinformatics. 2008 Mar 15;24(6):783-90. doi: 10.1093/bioinformatics/btn028. Epub 2008 Feb 1.

Cited by

1. A Survey of Biological Function Prediction Methods with Focus on Natural Language Processing (NLP) and Large Language Models (LLM).
   Methods Mol Biol. 2025;2941:201-225. doi: 10.1007/978-1-0716-4623-6_13.
2. Revealing arginine-cysteine and glycine-cysteine NOS linkages by a systematic re-evaluation of protein structures.
   Commun Chem. 2025 May 13;8(1):146. doi: 10.1038/s42004-025-01535-w.
3. Protein structure prediction via deep learning: an in-depth review.
   Front Pharmacol. 2025 Apr 3;16:1498662. doi: 10.3389/fphar.2025.1498662. eCollection 2025.
4. Evaluating the advancements in protein language models for encoding strategies in protein function prediction: a comprehensive review.
   Front Bioeng Biotechnol. 2025 Jan 21;13:1506508. doi: 10.3389/fbioe.2025.1506508. eCollection 2025.
5. Numerical stability of DeepGOPlus inference.
   PLoS One. 2024 Jan 29;19(1):e0296725. doi: 10.1371/journal.pone.0296725. eCollection 2024.
6. Recognizing the power of machine learning and other computational methods to accelerate progress in small molecule targeting of RNA.
   RNA. 2023 Apr;29(4):473-488. doi: 10.1261/rna.079497.122. Epub 2023 Jan 24.
7. Functional characterization of prokaryotic dark matter: the road so far and what lies ahead.
   Curr Res Microb Sci. 2022 Aug 7;3:100159. doi: 10.1016/j.crmicr.2022.100159. eCollection 2022.
8. PreAcrs: a machine learning framework for identifying anti-CRISPR proteins.
   BMC Bioinformatics. 2022 Oct 25;23(1):444. doi: 10.1186/s12859-022-04986-3.
9. Protein Science Meets Artificial Intelligence: A Systematic Review and a Biochemical Meta-Analysis of an Inter-Field.
   Front Bioeng Biotechnol. 2022 Jul 7;10:788300. doi: 10.3389/fbioe.2022.788300. eCollection 2022.
10. Gene function prediction in five model eukaryotes exclusively based on gene relative location through machine learning.
    Sci Rep. 2022 Jul 8;12(1):11655. doi: 10.1038/s41598-022-15329-w.