Rank-ordering of known enzymes as starting points for re-engineering novel substrate activity using a convolutional neural network.

Author information

Department of Chemical Engineering, The Pennsylvania State University, University Park, PA, 16802, USA.

Publication information

Metab Eng. 2023 Jul;78:171-182. doi: 10.1016/j.ymben.2023.06.001. Epub 2023 Jun 8.

Abstract

Retro-biosynthetic approaches have made significant advances in predicting synthesis routes for target biofuel, bio-renewable, or bio-active molecules. Relying only on cataloged enzymatic activities, however, limits the discovery of new production routes. Recent retro-biosynthetic algorithms therefore increasingly use novel conversions, which require altering the substrate or cofactor specificity of existing enzymes, to connect pathways leading to a target metabolite. Identifying and re-engineering enzymes for these novel conversions is currently the bottleneck in implementing such designed pathways. Herein, we present EnzRank, a convolutional neural network (CNN) based approach that rank-orders existing enzymes by their suitability for successful protein engineering, through directed evolution or de novo design, towards a desired substrate activity. We train the CNN model on 11,800 known active enzyme-substrate pairs from the BRENDA database as positive samples. Negative samples are generated by scrambling these pairs based on substrate dissimilarity, measured with the Tanimoto similarity score, between an enzyme's native substrate and all other molecules in the dataset. EnzRank achieves average recovery rates of 80.72% and 73.08% for positive and negative pairs, respectively, on test data after using a 10-fold holdout method for training and cross-validation. We further developed a web-based user interface (available at https://huggingface.co/spaces/vuu10/EnzRank) that predicts enzyme-substrate activity from the SMILES strings of substrates and the enzyme sequence, providing convenient access to EnzRank. In summary, this effort can help de novo pathway design tools prioritize starting enzyme candidates for re-engineering towards novel reactions, and can also help predict the potential secondary activities of enzymes in cell metabolism.
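
The negative-sampling idea in the abstract (pairing an enzyme with molecules that are dissimilar to its native substrate, as judged by Tanimoto similarity) can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the toy fingerprints (sets of "on" bit indices), the enzyme label "ADH", and the 0.3 dissimilarity cutoff are all assumptions made for illustration, since the abstract does not state the fingerprint type or the threshold used.

```python
def tanimoto(fp_a: set, fp_b: set) -> float:
    """Tanimoto similarity of two fingerprints given as sets of on-bit indices."""
    if not fp_a and not fp_b:
        return 0.0  # convention chosen here for two empty fingerprints
    shared = len(fp_a & fp_b)
    return shared / (len(fp_a) + len(fp_b) - shared)


def scrambled_negatives(positive_pairs, fingerprints, max_similarity=0.3):
    """Pair each enzyme with molecules dissimilar to its native substrate.

    positive_pairs: list of (enzyme, native_substrate) tuples.
    fingerprints:   dict mapping molecule name -> set of on-bit indices.
    max_similarity: hypothetical cutoff; candidates below it count as negatives.
    """
    positives = set(positive_pairs)
    negatives = []
    for enzyme, native in positive_pairs:
        for candidate, candidate_fp in fingerprints.items():
            pair = (enzyme, candidate)
            # Never relabel a known active pair, and avoid duplicate negatives.
            if pair in positives or pair in negatives:
                continue
            if tanimoto(fingerprints[native], candidate_fp) < max_similarity:
                negatives.append(pair)
    return negatives


# Toy data: made-up fingerprints, not derived from real structures.
fingerprints = {
    "ethanol": {1, 2, 3},
    "propanol": {1, 2, 3, 4},
    "benzoate": {7, 8, 9, 10},
}
positive_pairs = [("ADH", "ethanol")]

negatives = scrambled_negatives(positive_pairs, fingerprints)
print(negatives)
```

In this toy case, propanol is too similar to ethanol to serve as a negative, while benzoate shares no bits with it and is paired with the enzyme as a negative sample. In practice the fingerprints would be computed from SMILES strings with a cheminformatics toolkit rather than written by hand.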
