Artificial intelligence-aided protein engineering: from topological data analysis to deep protein language models.

Authors

Qiu Yuchi, Wei Guo-Wei

Affiliations

Department of Mathematics, Michigan State University, East Lansing, 48824, MI, USA.

Department of Biochemistry and Molecular Biology, Michigan State University, East Lansing, 48824, MI, USA.

Publication information

ArXiv. 2023 Jul 27:arXiv:2307.14587v1.

PMID:37547662
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC10402185/
Abstract

Protein engineering is an emerging field in biotechnology that has the potential to revolutionize various areas, such as antibody design, drug discovery, food security, ecology, and more. However, the mutational space involved is too vast to be handled through experimental means alone. Leveraging accumulative protein databases, machine learning (ML) models, particularly those based on natural language processing (NLP), have considerably expedited protein engineering. Moreover, advances in topological data analysis (TDA) and artificial intelligence-based protein structure prediction, such as AlphaFold2, have made more powerful structure-based ML-assisted protein engineering strategies possible. This review aims to offer a comprehensive, systematic, and indispensable set of methodological components, including TDA and NLP, for protein engineering and to facilitate their future development.

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/6f4a/10402185/562e665ea166/nihpp-2307.14587v1-f0002.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/6f4a/10402185/06c69b9e4ba7/nihpp-2307.14587v1-f0001.jpg
