BioStructNet: Structure-Based Network with Transfer Learning for Predicting Biocatalyst Functions.

Author information

Wang Xiangwen, Zhou Jiahui, Mueller Jane, Quinn Derek, Carvalho Alexandra, Moody Thomas S, Huang Meilan

Affiliations

School of Chemistry and Chemical Engineering, Queen's University Belfast, BT9 5AG Belfast, Northern Ireland, U.K.

Department of Biocatalysis and Isotope Chemistry, Almac Sciences, BT63 5QD Craigavon, Northern Ireland, U.K.

Publication information

J Chem Theory Comput. 2025 Jan 14;21(1):474-490. doi: 10.1021/acs.jctc.4c01391. Epub 2024 Dec 20.

Abstract

Enzyme-substrate interactions are essential to both biological processes and industrial applications. Advanced machine learning techniques have significantly accelerated biocatalysis research, revolutionizing the prediction of biocatalytic activities and facilitating the discovery of novel biocatalysts. However, the limited availability of data for specific enzyme functions, such as conversion efficiency and stereoselectivity, presents challenges for prediction accuracy. In this study, we developed BioStructNet, a structure-based deep learning network that integrates both protein and ligand structural data to capture the complexity of enzyme-substrate interactions. Benchmarking studies with different algorithms showed the enhanced predictive accuracy of BioStructNet. To further optimize the prediction accuracy for the small data set, we implemented transfer learning in the framework, training a source model on a large data set and fine-tuning it on a small, function-specific data set, using the CalB data set as a case study. The model performance was validated by comparing the attention heat maps generated by the BioStructNet interaction module with the enzyme-substrate interactions revealed from molecular dynamics simulations of enzyme-substrate complexes. BioStructNet would accelerate the discovery of functional enzymes for industrial use, particularly in cases where the training data sets for machine learning are small.
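
The transfer-learning workflow described in the abstract (train a source model on a large data set, then fine-tune it on a small, function-specific one) can be illustrated with a minimal sketch. This is not the BioStructNet architecture: a plain linear regressor trained by gradient descent stands in for the deep network, the data are synthetic, and every name, weight, sample size, and hyperparameter below is a hypothetical choice made for illustration only.

```python
import random


def make_data(weights, bias, n, rng, noise=0.05):
    """Generate n samples of a noisy linear function (synthetic stand-in data)."""
    data = []
    for _ in range(n):
        x = [rng.uniform(-1.0, 1.0) for _ in weights]
        y = sum(w * xi for w, xi in zip(weights, x)) + bias + rng.gauss(0.0, noise)
        data.append((x, y))
    return data


def predict(params, x):
    """Linear prediction: dot(weights, x) + bias."""
    weights, bias = params
    return sum(w * xi for w, xi in zip(weights, x)) + bias


def mse(params, data):
    """Mean squared error of the model on a data set."""
    return sum((predict(params, x) - y) ** 2 for x, y in data) / len(data)


def train(params, data, epochs, lr):
    """Full-batch gradient descent on MSE, starting from the given params."""
    weights, bias = list(params[0]), params[1]
    n = len(data)
    for _ in range(epochs):
        grad_w = [0.0] * len(weights)
        grad_b = 0.0
        for x, y in data:
            err = predict((weights, bias), x) - y
            for i, xi in enumerate(x):
                grad_w[i] += 2.0 * err * xi / n
            grad_b += 2.0 * err / n
        weights = [w - lr * g for w, g in zip(weights, grad_w)]
        bias -= lr * grad_b
    return weights, bias


rng = random.Random(0)  # fixed seed so the run is reproducible

# Hypothetical "large" source task and a related "small" target task.
source_data = make_data([2.0, -1.0, 0.5], 0.3, 500, rng)
target_train = make_data([2.2, -0.8, 0.4], 0.2, 8, rng)
target_test = make_data([2.2, -0.8, 0.4], 0.2, 200, rng)

init = ([0.0, 0.0, 0.0], 0.0)

# Step 1: train the source model on the large data set.
source_model = train(init, source_data, epochs=200, lr=0.1)

# Step 2: fine-tune from the source weights on the small target set.
fine_tuned = train(source_model, target_train, epochs=20, lr=0.05)

# Baseline: same small budget on the target set, but from scratch.
from_scratch = train(init, target_train, epochs=20, lr=0.05)

mse_fine_tuned = mse(fine_tuned, target_test)
mse_from_scratch = mse(from_scratch, target_test)
print("fine-tuned test MSE:  ", mse_fine_tuned)
print("from-scratch test MSE:", mse_from_scratch)
```

In this toy setting the fine-tuned model starts close to the target solution because the two tasks are related, so a short training run on eight samples suffices where the from-scratch baseline is still far from converged. Whether a comparable gain holds for a real enzyme data set depends on how closely the source and target tasks are related.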

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/da1d/11736791/4750882d580b/ct4c01391_0001.jpg
