In silico protein function prediction: the rise of machine learning-based approaches.

Author information

Chen Jiaxiao, Gu Zhonghui, Lai Luhua, Pei Jianfeng

Affiliations

Center for Quantitative Biology, Academy for Advanced Interdisciplinary Studies, Peking University, Beijing, China.

Peking-Tsinghua Center for Life Sciences, Academy for Advanced Interdisciplinary Studies, Peking University, Beijing, China.

Publication information

Med Rev (2021). 2023 Nov 29;3(6):487-510. doi: 10.1515/mr-2023-0038. eCollection 2023 Dec.

Abstract

Proteins function as integral actors in essential life processes, rendering the realm of protein research a fundamental domain that possesses the potential to propel advancements in pharmaceuticals and disease investigation. Within the context of protein research, an imperious demand arises to uncover protein functionalities and untangle intricate mechanistic underpinnings. Due to the exorbitant costs and limited throughput inherent in experimental investigations, computational models offer a promising alternative to accelerate protein function annotation. In recent years, protein pre-training models have exhibited noteworthy advancement across multiple prediction tasks. This advancement highlights a notable prospect for effectively tackling the intricate downstream task associated with protein function prediction. In this review, we elucidate the historical evolution and research paradigms of computational methods for predicting protein function. Subsequently, we summarize the progress in protein and molecule representation as well as feature extraction techniques. Furthermore, we assess the performance of machine learning-based algorithms across various objectives in protein function prediction, thereby offering a comprehensive perspective on the progress within this field.

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/2089/10808870/9dc42c62fe45/j_mr-2023-0038_fig_001.jpg