Prediction of carbohydrate binding sites on protein surfaces with 3-dimensional probability density distributions of interacting atoms.

Affiliation

Genomics Research Center, Academia Sinica, Taipei, Taiwan.

Publication information

PLoS One. 2012;7(7):e40846. doi: 10.1371/journal.pone.0040846. Epub 2012 Jul 25.

DOI: 10.1371/journal.pone.0040846

PMID: 22848404

Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC3405063/

Abstract

Non-covalent protein-carbohydrate interactions mediate molecular targeting in many biological processes. Predicting non-covalent carbohydrate binding sites on protein surfaces provides insight into the functions of the query proteins. In addition, information on key carbohydrate-binding residues can suggest site-directed mutagenesis experiments, inform the design of therapeutics targeting carbohydrate-binding proteins, and guide the engineering of protein-carbohydrate interactions. In this work, we show that non-covalent carbohydrate binding sites on protein surfaces can be predicted with relatively high accuracy when the query protein structures are known. The predictions are based on a novel encoding scheme for three-dimensional probability density maps that describe the distributions of 36 non-covalent interacting atom types around protein surfaces. One machine learning model was trained for each of the 30 protein atom types. The machine learning algorithms predicted tentative carbohydrate binding sites on query proteins by recognizing the interacting atom distribution patterns that are characteristic of carbohydrate binding sites in known protein structures. The prediction results for all protein atom types were integrated into surface patches, which serve as tentative carbohydrate binding sites, based on normalized prediction confidence levels. The predictors were benchmarked by 10-fold cross-validation on 497 non-redundant proteins with known carbohydrate binding sites and were further tested on an independent test set of 108 proteins. The residue-based Matthews correlation coefficient (MCC) for the independent test was 0.45, with prediction precision and sensitivity (recall) of 0.45 and 0.49, respectively. In addition, the trained predictors were applied to 111 unbound carbohydrate-binding protein structures, that is, structures determined in the absence of the carbohydrate ligands; the overall prediction MCC was 0.49. Independent tests on anti-carbohydrate antibodies showed that the carbohydrate antigen binding sites were predicted with comparable accuracy. These results demonstrate that the predictors are among the best carbohydrate binding site predictors to date.
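The residue-based figures quoted in the abstract (MCC, precision, and sensitivity) all derive from a per-residue confusion matrix. The following is a minimal sketch in Python, assuming binary per-residue labels (1 = binding-site residue, 0 = any other surface residue). The function name `residue_metrics` and the toy labels are illustrative assumptions and are not taken from the paper.

```python
import math


def residue_metrics(true_labels, predicted_labels):
    """Return (precision, recall, mcc) for binary per-residue labels."""
    tp = fp = tn = fn = 0
    for truth, pred in zip(true_labels, predicted_labels):
        if truth == 1 and pred == 1:
            tp += 1  # binding residue correctly predicted
        elif truth == 0 and pred == 1:
            fp += 1  # non-binding residue predicted as binding
        elif truth == 0 and pred == 0:
            tn += 1  # non-binding residue correctly rejected
        else:
            fn += 1  # binding residue missed

    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0

    # MCC = (TP*TN - FP*FN) / sqrt((TP+FP)(TP+FN)(TN+FP)(TN+FN)),
    # conventionally set to 0 when the denominator vanishes.
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0
    return precision, recall, mcc


# Hypothetical toy example: 10 surface residues, 3 of them truly binding.
true_labels = [1, 1, 1, 0, 0, 0, 0, 0, 0, 0]
predicted_labels = [1, 1, 0, 1, 0, 0, 0, 0, 0, 0]
precision, recall, mcc = residue_metrics(true_labels, predicted_labels)
print(f"precision={precision:.2f} recall={recall:.2f} MCC={mcc:.2f}")
```

Because binding-site residues are usually a small minority of the surface, MCC is a more informative summary than plain accuracy: it uses all four cells of the confusion matrix and stays near zero for a predictor that labels everything as non-binding.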

Similar articles

Prediction of FMN-binding residues with three-dimensional probability distributions of interacting atoms on protein surfaces.

J Theor Biol. 2014 Feb 21;343:154-61. doi: 10.1016/j.jtbi.2013.10.020. Epub 2013 Nov 7.

Protein-protein interaction site predictions with three-dimensional probability distributions of interacting atoms on protein surfaces.

PLoS One. 2012;7(6):e37706. doi: 10.1371/journal.pone.0037706. Epub 2012 Jun 6.

Predicting Ligand Binding Sites on Protein Surfaces by 3-Dimensional Probability Density Distributions of Interacting Atoms.

PLoS One. 2016 Aug 11;11(8):e0160315. doi: 10.1371/journal.pone.0160315. eCollection 2016.

Prediction of fatty acid-binding residues on protein surfaces with three-dimensional probability distributions of interacting atoms.

Biophys Chem. 2014 Aug;192:10-9. doi: 10.1016/j.bpc.2014.05.002. Epub 2014 May 29.

Analysis and prediction of carbohydrate binding sites.

Protein Eng. 2000 Feb;13(2):89-98. doi: 10.1093/protein/13.2.89.

Structure-based prediction of protein-peptide binding regions using Random Forest.

Bioinformatics. 2018 Feb 1;34(3):477-484. doi: 10.1093/bioinformatics/btx614.

Sequence and structural features of carbohydrate binding in proteins and assessment of predictability using a neural network.

BMC Struct Biol. 2007 Jan 3;7:1. doi: 10.1186/1472-6807-7-1.

Protein-RNA interface residue prediction using machine learning: an assessment of the state of the art.

BMC Bioinformatics. 2012 May 10;13:89. doi: 10.1186/1471-2105-13-89.

Sequence-Based Prediction of Protein-Carbohydrate Binding Sites Using Support Vector Machines.

J Chem Inf Model. 2016 Oct 24;56(10):2115-2122. doi: 10.1021/acs.jcim.6b00320. Epub 2016 Sep 22.

Cited by

Highly accurate carbohydrate-binding site prediction with DeepGlycanSite.

Nat Commun. 2024 Jun 17;15(1):5163. doi: 10.1038/s41467-024-49516-2.

Structure-based neural network protein-carbohydrate interaction predictions at the residue level.

Front Bioinform. 2023 Jun 20;3:1186531. doi: 10.3389/fbinf.2023.1186531. eCollection 2023.

Structure-Based Neural Network Protein-Carbohydrate Interaction Predictions at the Residue Level.

bioRxiv. 2023 Mar 15:2023.03.14.531382. doi: 10.1101/2023.03.14.531382.

Computational Analysis of Antibody Paratopes for Antibody Sequences in Antibody Libraries.

Methods Mol Biol. 2023;2552:437-445. doi: 10.1007/978-1-0716-2609-2_24.

Effective binding to protein antigens by antibodies from antibody libraries designed with enhanced protein recognition propensities.

MAbs. 2019 Feb/Mar;11(2):373-387. doi: 10.1080/19420862.2018.1550320. Epub 2019 Jan 9.

Insights into the effects of glycosylation and the monosaccharide-binding activity of the plant lectin CrataBL.

Glycoconj J. 2017 Aug;34(4):515-522. doi: 10.1007/s10719-017-9766-7. Epub 2017 Mar 15.

Predicting Ligand Binding Sites on Protein Surfaces by 3-Dimensional Probability Density Distributions of Interacting Atoms.

PLoS One. 2016 Aug 11;11(8):e0160315. doi: 10.1371/journal.pone.0160315. eCollection 2016.

Crystal structure of Streptococcus pneumoniae pneumolysin provides key insights into early steps of pore formation.

Sci Rep. 2015 Sep 25;5:14352. doi: 10.1038/srep14352.

The cholesterol-dependent cytolysins pneumolysin and streptolysin O require binding to red blood cell glycans for hemolytic activity.

Proc Natl Acad Sci U S A. 2014 Dec 9;111(49):E5312-20. doi: 10.1073/pnas.1412703111. Epub 2014 Nov 24.

References

Rationalization and design of the complementarity determining region sequences in an antibody-antigen recognition interface.

PLoS One. 2012;7(3):e33340. doi: 10.1371/journal.pone.0033340. Epub 2012 Mar 22.

InCa-SiteFinder: a method for structure-based prediction of inositol and carbohydrate binding sites on proteins.

J Mol Graph Model. 2009 Oct;28(3):297-303. doi: 10.1016/j.jmgm.2009.08.009. Epub 2009 Aug 27.

Prediction of protein-glucose binding sites using support vector machines.

Proteins. 2009 Oct;77(1):121-32. doi: 10.1002/prot.22424.

The prospects of glycan biomarkers for the diagnosis of diseases.

Mol Biosyst. 2009 Jan;5(1):17-20. doi: 10.1039/b811781k. Epub 2008 Nov 6.

Protease substrate site predictors derived from machine learning on multilevel substrate phage display data.

Bioinformatics. 2008 Dec 1;24(23):2691-7. doi: 10.1093/bioinformatics/btn538. Epub 2008 Oct 29.

Sequence and structural features of carbohydrate binding in proteins and assessment of predictability using a neural network.

BMC Struct Biol. 2007 Jan 3;7:1. doi: 10.1186/1472-6807-7-1.

An empirical approach for structure-based prediction of carbohydrate-binding sites on proteins.

Protein Eng. 2003 Jul;16(7):467-78. doi: 10.1093/protein/gzg065.

Discrimination of native protein structures using atom-atom contact scoring.

Proc Natl Acad Sci U S A. 2003 Mar 18;100(6):3215-20. doi: 10.1073/pnas.0535768100. Epub 2003 Mar 11.

Analysis and prediction of carbohydrate binding sites.

Protein Eng. 2000 Feb;13(2):89-98. doi: 10.1093/protein/13.2.89.

Structural basis of lectin-carbohydrate recognition.

Annu Rev Biochem. 1996;65:441-73. doi: 10.1146/annurev.bi.65.070196.002301.
