A hybrid variational autoencoder and WGAN with gradient penalty for tertiary protein structure generation.

Author information

Sehsah Aalaa I, Mousa Afaf, Farouk Gamal

Affiliations

Department of Computer Science, Faculty of Computers and Information, Kafrelsheikh University, Kafr El Sheikh, 33516, Egypt.

Department of Computer Science, Faculty of Computers and Information, Menoufia University, Shebin El Kom, 32511, Egypt.

Publication information

Sci Rep. 2025 Apr 23;15(1):14191. doi: 10.1038/s41598-025-94747-y.

Abstract

Elucidating the tertiary structure of proteins is important for understanding their functions and interactions. While deep neural networks have advanced the prediction of a protein's native structure from its amino acid sequence, the focus on a single-structure view limits understanding of the dynamic nature of protein molecules. Acquiring a multi-structure view of protein molecules remains a broader challenge in computational structural biology. Alternative representations, such as distance matrices, offer a compact and effective way to explore and generate realistic tertiary protein structures. This paper presents TP-VWGAN, a hybrid model that improves the realism of generated distance-matrix representations of tertiary protein structures. The model integrates the probabilistic representation learning of the Variational Autoencoder (VAE) with the realistic data generation strength of the Wasserstein Generative Adversarial Network with Gradient Penalty (WGAN-GP). The main modification in TP-VWGAN is the incorporation of residual blocks into its VAE architecture to improve performance. The experimental results show that TP-VWGAN, both with and without residual blocks, outperforms existing methods in generating realistic protein structures, and that incorporating residual blocks enhances its ability to capture key structural features. Benchmarking against existing methods also shows that the more accurately a model learns symmetry features in the generated distance matrices, the better it captures key structural features. This work moves us closer to more advanced deep generative models that can explore a broader range of protein structures and be applied to drug design and protein engineering. The code and data are available at https://github.com/aalaa-sehsah/tp-vwgan.
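To make the distance-matrix representation and the symmetry property mentioned in the abstract concrete, the following is a minimal sketch in plain Python. It is not taken from the TP-VWGAN repository: the function names (`distance_matrix`, `symmetry_error`, `symmetrize`), the random 16-point "backbone", and the Gaussian noise standing in for a generator's output are all illustrative assumptions, and the symmetry score shown here is only one plausible way to quantify the property, not necessarily the metric used in the paper.

```python
import math
import random


def distance_matrix(coords):
    """Pairwise Euclidean distances for a list of (x, y, z) points."""
    return [[math.dist(p, q) for q in coords] for p in coords]


def symmetry_error(matrix):
    """Mean absolute difference between matrix[i][j] and matrix[j][i]."""
    n = len(matrix)
    total = sum(abs(matrix[i][j] - matrix[j][i])
                for i in range(n) for j in range(n))
    return total / (n * n)


def symmetrize(matrix):
    """Average each entry with its mirror entry and force a zero diagonal."""
    n = len(matrix)
    return [[0.0 if i == j else 0.5 * (matrix[i][j] + matrix[j][i])
             for j in range(n)]
            for i in range(n)]


random.seed(0)

# Hypothetical 16-residue fragment: random points stand in for backbone atoms.
coords = [tuple(random.gauss(0.0, 1.0) for _ in range(3)) for _ in range(16)]

# A matrix computed from real coordinates is symmetric with a zero diagonal.
real = distance_matrix(coords)

# Assumed stand-in for a generated matrix: independent noise on every entry
# breaks the symmetry that a valid distance matrix must have.
noisy = [[d + random.gauss(0.0, 0.1) for d in row] for row in real]

# Post-hoc repair, shown only for comparison with a model that learns symmetry.
repaired = symmetrize(noisy)

print(symmetry_error(real), symmetry_error(noisy), symmetry_error(repaired))
```

A matrix derived from actual coordinates scores essentially zero, while a generated matrix with asymmetric entries scores higher. A score of this kind gives one way to compare how well different generative models have learned the symmetry constraint.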
