Generating tertiary protein structures via interpretable graph variational autoencoders.

Author information

Guo Xiaojie, Du Yuanqi, Tadepalli Sivani, Zhao Liang, Shehu Amarda

Affiliations

Department of Information Sciences and Technology, George Mason University, Fairfax, VA 22030, USA.

Department of Computer Science, George Mason University, Fairfax, VA 22030, USA.

Publication information

Bioinform Adv. 2021 Nov 29;1(1):vbab036. doi: 10.1093/bioadv/vbab036. eCollection 2021.

Abstract

MOTIVATION

Modeling the structural plasticity of protein molecules remains challenging. Most research has focused on obtaining one biologically active structure. This includes the recent AlphaFold2, which has been hailed as a breakthrough for protein modeling. Computing one structure does not suffice to understand how proteins modulate their interactions and even evade our immune system. Revealing the structure space available to a protein remains challenging. Data-driven approaches that learn to generate tertiary structures are increasingly garnering attention. These approaches exploit the ability to represent tertiary structures as contact or distance maps and make direct analogies with images to harness convolution-based generative adversarial frameworks from computer vision. Since such opportunistic analogies do not allow capturing highly structured data, current deep models struggle to generate physically realistic tertiary structures.

RESULTS

We present novel deep generative models that build upon the graph variational autoencoder framework. In contrast to existing literature, we represent tertiary structures as 'contact' graphs, which allow us to leverage graph-generative deep learning. Our models are able to capture rich, local and distal constraints and additionally compute disentangled latent representations that reveal the impact of individual latent factors. This elucidates what the factors control and makes our models more interpretable. Rigorous comparative evaluation along various metrics shows that the models we propose advance the state of the art. While there is still much ground to cover, the work presented here is an important first step, and graph-generative frameworks promise to get us to our goal of unraveling the exquisite structural complexity of protein molecules.
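To make the 'contact' graph representation concrete, the following is a minimal illustrative sketch, not the authors' implementation. It assumes one representative atom per residue (for example the alpha carbon) and an 8 Å distance cutoff; neither choice is stated in the abstract, and the function name `contact_graph` and the toy coordinates are hypothetical.

```python
import math

def contact_graph(coords, threshold=8.0):
    """Build a contact graph from per-residue 3D coordinates.

    coords: list of (x, y, z) tuples, one per residue (assumed: C-alpha atoms).
    threshold: distance cutoff in angstroms (assumed value, not from the paper).
    Returns a symmetric 0/1 adjacency matrix and a list of (i, j) edges, i < j.
    """
    n = len(coords)
    adjacency = [[0] * n for _ in range(n)]
    edges = []
    for i in range(n):
        for j in range(i + 1, n):
            # Two residues are "in contact" if they lie within the cutoff.
            if math.dist(coords[i], coords[j]) <= threshold:
                adjacency[i][j] = adjacency[j][i] = 1
                edges.append((i, j))
    return adjacency, edges

# Toy input: four hypothetical residues spaced 3.8 angstroms apart on a line.
toy_coords = [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0), (7.6, 0.0, 0.0), (11.4, 0.0, 0.0)]
adjacency, edges = contact_graph(toy_coords)
print(edges)
```

In this toy chain, residues one or two positions apart fall within the cutoff, while the two end residues (11.4 Å apart) do not. A graph-generative model would operate on node and edge structure of this kind rather than treating the contact map as an image.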

AVAILABILITY AND IMPLEMENTATION

Code is available at https://github.com/anonymous1025/CO-VAE.

SUPPLEMENTARY INFORMATION

Supplementary data are available at Bioinformatics Advances online.

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/db68/9710582/319681230d55/vbab036f1.jpg