MAHOMES II: A webserver for predicting if a metal binding site is enzymatic.

Authors

Feehan Ryan, Copeland Matthew, Franklin Meghan W, Slusky Joanna S G

Affiliations

Center for Computational Biology, The University of Kansas, 2030 Becker Dr., Lawrence, KS 66047.

Department of Molecular Biosciences, The University of Kansas, 1200 Sunnyside Ave. Lawrence KS 66045-3101.

Publication information

bioRxiv. 2023 Mar 12:2023.03.08.531790. doi: 10.1101/2023.03.08.531790.

DOI: 10.1101/2023.03.08.531790
PMID: 36945603
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC10028950/

Abstract

Recent advances have enabled high-quality computationally generated structures for proteins with no solved crystal structures. However, protein function data remains largely limited to experimental methods and homology mapping. Since structure determines function, methods that can use computationally generated structures for functional annotation need to be advanced. Our laboratory recently developed a method to distinguish between metalloenzyme and non-enzyme sites. Here we report improvements to this method: we upgraded our physicochemical features to alleviate the need for structures with sub-angstrom precision, and we used machine learning to reduce training data labeling error. Our improved classifier identifies protein-bound metal sites as enzymatic or non-enzymatic with 94% precision and 92% recall. We demonstrate that both adjustments increased predictive performance and reliability on sites with sub-angstrom variations. We constructed a set of predicted metalloprotein structures with no solved crystal structures and no detectable homology to our training data. Our model had an accuracy of 90-97.5%, depending on the quality of the predicted structures included in our test. Finally, we found that the physicochemical trends driving this model's successful performance were local protein density, second-shell ionizable residue burial, and the pocket's accessibility to the site. We anticipate that our model's ability to correctly identify catalytic metal sites could enable identification of new enzymatic mechanisms and improve metalloenzyme design success rates.
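
For readers less familiar with the metrics quoted above, the following is a minimal sketch of how precision and recall are computed for a binary enzymatic/non-enzymatic classifier. The function name and the counts are hypothetical and chosen only for illustration; they are not taken from the paper's data or code.

```python
def precision_recall(tp: int, fp: int, fn: int) -> tuple[float, float]:
    """Return (precision, recall) for the positive (enzymatic) class.

    tp: enzymatic sites predicted enzymatic
    fp: non-enzymatic sites predicted enzymatic
    fn: enzymatic sites predicted non-enzymatic
    """
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return precision, recall


# Hypothetical counts for illustration only (not from the paper).
tp, fp, fn = 92, 6, 8
precision, recall = precision_recall(tp, fp, fn)
print(f"precision={precision:.2f}, recall={recall:.2f}")
```

Precision answers "of the sites called enzymatic, how many truly are?", while recall answers "of the truly enzymatic sites, how many were found?"; a classifier can score well on one and poorly on the other, which is why both are usually reported together.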

SIGNIFICANCE STATEMENT

Identification of enzyme active sites on proteins with unsolved crystallographic structures can accelerate discovery of novel biochemical reactions, which can impact healthcare, industrial processes, and environmental remediation. Our lab has developed a machine learning tool for predicting whether sites on computationally generated protein structures are enzymatic or non-enzymatic. We have made our tool available on a webserver, allowing the scientific community to rapidly search previously unknown protein function space.

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/defc/10028950/df8ecc389faf/nihpp-2023.03.08.531790v1-f0001.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/defc/10028950/9f480fea0ca2/nihpp-2023.03.08.531790v1-f0002.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/defc/10028950/fcc7412775da/nihpp-2023.03.08.531790v1-f0003.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/defc/10028950/266d374cf516/nihpp-2023.03.08.531790v1-f0004.jpg
