LM-GVP: an extensible sequence and structure informed deep learning framework for protein property prediction.

Affiliations

Amazon Machine Learning Solutions Lab, Amazon Web Services, Santa Clara, CA, USA.

Janssen Biotherapeutics, The Janssen Pharmaceutical Companies of Johnson & Johnson, Spring House, PA, USA.

Publication information

Sci Rep. 2022 Apr 27;12(1):6832. doi: 10.1038/s41598-022-10775-y.

Abstract

Proteins perform many essential functions in biological systems and can be successfully developed as bio-therapeutics. It is invaluable to be able to predict their properties based on a proposed sequence and structure. In this study, we developed a novel generalizable deep learning framework, LM-GVP, composed of a protein Language Model (LM) and Graph Neural Network (GNN) to leverage information from both 1D amino acid sequences and 3D structures of proteins. Our approach outperformed the state-of-the-art protein LMs on a variety of property prediction tasks including fluorescence, protease stability, and protein functions from Gene Ontology (GO). We also illustrated insights into how a GNN prediction head can inform the fine-tuning of protein LMs to better leverage structural information. We envision that our deep learning framework will be generalizable to many protein property prediction problems to greatly accelerate protein engineering and drug development.
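
The abstract describes the architecture only at a high level: per-residue features from a protein language model are passed to a graph neural network built on the 3D structure. The sketch below illustrates that data flow under loud assumptions and is not the authors' implementation. The language model is replaced by a fixed random lookup table, the GVP layers are replaced by a single round of plain mean aggregation over a k-nearest-neighbour graph, and every name (`embed_sequence`, `knn_edges`, `message_pass`, `predict`) and parameter is hypothetical.

```python
import math
import random

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def embed_sequence(seq, dim=8, seed=0):
    """Stand-in for a protein LM (assumption): one fixed random vector per residue type."""
    rng = random.Random(seed)
    table = {aa: [rng.uniform(-1, 1) for _ in range(dim)] for aa in AMINO_ACIDS}
    return [table[aa] for aa in seq]


def knn_edges(coords, k=2):
    """Connect each residue to its k nearest neighbours in 3D space."""
    edges = {}
    for i, ci in enumerate(coords):
        dists = sorted(
            (math.dist(ci, cj), j) for j, cj in enumerate(coords) if j != i
        )
        edges[i] = [j for _, j in dists[:k]]
    return edges


def message_pass(features, edges):
    """One round of mean aggregation (not GVP): average each node with its neighbours."""
    out = []
    for i, h in enumerate(features):
        group = [h] + [features[j] for j in edges[i]]
        out.append([sum(col) / len(group) for col in zip(*group)])
    return out


def predict(seq, coords, weights, k=2):
    """Sequence features -> structure-aware features -> pooled scalar property."""
    h = embed_sequence(seq, dim=len(weights))
    h = message_pass(h, knn_edges(coords, k))
    pooled = [sum(col) / len(h) for col in zip(*h)]  # mean over residues
    return sum(w * x for w, x in zip(weights, pooled))  # linear prediction head


if __name__ == "__main__":
    # Toy input: four residues placed on a straight line, untrained weights.
    seq = "ACDG"
    coords = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (2.0, 0.0, 0.0), (3.0, 0.0, 0.0)]
    print(predict(seq, coords, weights=[0.1] * 8))
```

In a real system the lookup table would be a pretrained language model and the prediction head would be trained jointly with it, which is the setting in which the abstract reports that the GNN head can inform fine-tuning of the language model.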


https://cdn.ncbi.nlm.nih.gov/pmc/blobs/4da1/9046255/c211b31d30eb/41598_2022_10775_Fig1_HTML.jpg
