Protein engineering in the deep learning era.

Author information

Zhou Bingxin, Tan Yang, Hu Yutong, Zheng Lirong, Zhong Bozitao, Hong Liang

Affiliations

Institute of Natural Sciences, Shanghai Jiao Tong University, Shanghai, China.

Shanghai National Center for Applied Mathematics (SJTU Center), Shanghai Jiao Tong University, Shanghai, China.

Publication information

mLife. 2024 Dec 26;3(4):477-491. doi: 10.1002/mlf2.12157. eCollection 2024 Dec.

Abstract

Advances in deep learning have significantly aided protein engineering in addressing challenges in industrial production, healthcare, and environmental sustainability. This review frames frequently researched problems in protein understanding and engineering from the perspective of deep learning. It provides a thorough discussion of representation methods for protein sequences and structures, along with general encoding pipelines that support both pre-training and supervised learning tasks. We summarize state-of-the-art protein language models, geometric deep learning techniques, and combinations of distinct approaches for learning from multi-modal biological data. Additionally, we outline common downstream tasks and relevant benchmark datasets for training and evaluating deep learning models, focusing on the particular needs of protein engineering applications, such as identifying mutation sites and predicting properties for the virtual screening of candidates. This review offers biologists the latest tools for assisting their engineering projects while providing a clear and comprehensive guide for computer scientists to develop more powerful solutions by standardizing problem formulation and consolidating data resources. Future research is expected to bring a deeper integration of the biology and computer science communities, unleashing the full potential of deep learning in protein engineering and driving new scientific breakthroughs.

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/1fc2/11685842/46b27818e3b0/MLF2-3-477-g001.jpg
