Evolutionary-scale prediction of atomic-level protein structure with a language model.

Affiliations

FAIR, Meta AI, New York, NY, USA.

New York University, New York, NY, USA.

Publication information

Science. 2023 Mar 17;379(6637):1123-1130. doi: 10.1126/science.ade2574. Epub 2023 Mar 16.

Abstract

Recent advances in machine learning have leveraged evolutionary information in multiple sequence alignments to predict protein structure. We demonstrate direct inference of full atomic-level protein structure from primary sequence using a large language model. As language models of protein sequences are scaled up to 15 billion parameters, an atomic-resolution picture of protein structure emerges in the learned representations. This results in an order-of-magnitude acceleration of high-resolution structure prediction, which enables large-scale structural characterization of metagenomic proteins. We apply this capability to construct the ESM Metagenomic Atlas by predicting structures for >617 million metagenomic protein sequences, including >225 million that are predicted with high confidence, which gives a view into the vast breadth and diversity of natural proteins.

