Enzyme sequence optimisation via Gromov-Wasserstein Autoencoders integrating MSA techniques.

Authors

Wang Xuze, Li Yangyang, Hou Xiancong, Liu Hao

Affiliation

College of Computer Science and Technology, Ocean University of China, Qingdao, China.

Publication information

J Enzyme Inhib Med Chem. 2025 Dec;40(1):2524742. doi: 10.1080/14756366.2025.2524742. Epub 2025 Jul 3.

DOI:10.1080/14756366.2025.2524742

PMID:40607666

Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC12231317/

Abstract

Enzyme sequence design has always been a challenging task, particularly in optimising key properties such as enzyme solubility, stability, and activity. This study proposes an innovative approach by utilising a variational autoencoder (VAE) model integrated with the Gromov-Wasserstein (GW) distance for enzyme sequence optimisation. The GWAE model improves representation learning by using the GW distance, thereby generating functional variants with desired characteristics. We also introduce an innovative enzyme dataset construction method that incorporates multiple sequence alignment (MSA) techniques to address sequence length discrepancies, enhancing the accuracy of the optimisation process. Experimental results show that the GWAE model outperforms the traditional VAE on multiple metrics. The generated enzyme sequences demonstrate superior solubility, stability, and hydrophobicity. Additionally, by integrating AlphaFold3 for structural prediction, we verify the structural stability of the generated sequences, further enhancing their practical applicability.
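The abstract states that MSA is used to address sequence length discrepancies but does not describe the encoding itself. The following is a minimal sketch of one common way this could work, assuming the sequences have already been aligned by an external MSA tool so that every row has the same number of columns, with `-` marking gaps. The alphabet, the function names (`one_hot_encode`, `encode_alignment`, `decode`), and the toy alignment are all illustrative assumptions, not the authors' pipeline.

```python
# Hypothetical sketch: names and the toy alignment are illustrative, not from the paper.
ALPHABET = "ACDEFGHIKLMNPQRSTVWY-"  # 20 standard amino acids plus the gap symbol
INDEX = {ch: i for i, ch in enumerate(ALPHABET)}


def one_hot_encode(aligned_seq):
    """Return a list of one-hot rows, one per alignment column."""
    rows = []
    for ch in aligned_seq.upper():
        if ch not in INDEX:
            raise ValueError(f"unexpected symbol {ch!r}")
        row = [0] * len(ALPHABET)
        row[INDEX[ch]] = 1
        rows.append(row)
    return rows


def encode_alignment(aligned_seqs):
    """Encode an MSA whose rows already share one length (gaps as '-')."""
    lengths = {len(s) for s in aligned_seqs}
    if len(lengths) != 1:
        raise ValueError("aligned sequences must all have the same length")
    return [one_hot_encode(s) for s in aligned_seqs]


def decode(rows):
    """Map one-hot rows back to an ungapped sequence."""
    return "".join(ALPHABET[row.index(1)] for row in rows).replace("-", "")


# Toy alignment: three sequences of different raw lengths, padded by gaps.
msa = ["MKT-AYIA", "MKTQAYIA", "MK--AY-A"]
encoded = encode_alignment(msa)
```

Because every aligned row has the same number of columns, each sequence becomes a matrix of identical shape (columns by 21 symbols here), which is the kind of fixed-size input an autoencoder needs. Dropping gap symbols when decoding recovers a variable-length sequence.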

Figure: https://cdn.ncbi.nlm.nih.gov/pmc/blobs/912a/12231317/f982ef8bd81b/IENZ_A_2524742_F0001_C.jpg
