Prediction of carbohydrate binding sites on protein surfaces with 3-dimensional probability density distributions of interacting atoms.

Author information

Genomics Research Center, Academia Sinica, Taipei, Taiwan.

Publication information

PLoS One. 2012;7(7):e40846. doi: 10.1371/journal.pone.0040846. Epub 2012 Jul 25.

Abstract

Non-covalent protein-carbohydrate interactions mediate molecular targeting in many biological processes. Prediction of non-covalent carbohydrate binding sites on protein surfaces not only provides insights into the functions of the query proteins; information on key carbohydrate-binding residues could suggest site-directed mutagenesis experiments, design therapeutics targeting carbohydrate-binding proteins, and provide guidance in engineering protein-carbohydrate interactions. In this work, we show that non-covalent carbohydrate binding sites on protein surfaces can be predicted with relatively high accuracy when the query protein structures are known. The prediction capabilities were based on a novel encoding scheme of the three-dimensional probability density maps describing the distributions of 36 non-covalent interacting atom types around protein surfaces. One machine learning model was trained for each of the 30 protein atom types. The machine learning algorithms predicted tentative carbohydrate binding sites on query proteins by recognizing the characteristic interacting atom distribution patterns specific for carbohydrate binding sites from known protein structures. The prediction results for all protein atom types were integrated into surface patches as tentative carbohydrate binding sites based on normalized prediction confidence level. The prediction capabilities of the predictors were benchmarked by a 10-fold cross validation on 497 non-redundant proteins with known carbohydrate binding sites. The predictors were further tested on an independent test set with 108 proteins. The residue-based Matthews correlation coefficient (MCC) for the independent test was 0.45, with prediction precision and sensitivity (or recall) of 0.45 and 0.49 respectively. In addition, 111 unbound carbohydrate-binding protein structures for which the structures were determined in the absence of the carbohydrate ligands were predicted with the trained predictors. The overall prediction MCC was 0.49. Independent tests on anti-carbohydrate antibodies showed that the carbohydrate antigen binding sites were predicted with comparable accuracy. These results demonstrate that the predictors are among the best in carbohydrate binding site predictions to date.
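
The abstract reports residue-based MCC, precision, and sensitivity (recall). As a minimal sketch of how such metrics are conventionally computed from per-residue binary labels, the example below uses the standard textbook definitions. The function names and the toy labels are illustrative assumptions and are not taken from the paper, which does not describe its evaluation code here.

```python
import math


def confusion_counts(predicted, actual):
    """Count TP, FP, TN, FN over paired per-residue binary labels.

    `predicted` and `actual` are equal-length sequences where a truthy
    value marks a residue as (predicted / known) carbohydrate-binding.
    """
    tp = fp = tn = fn = 0
    for p, a in zip(predicted, actual):
        p, a = bool(p), bool(a)
        if p and a:
            tp += 1
        elif p and not a:
            fp += 1
        elif not p and a:
            fn += 1
        else:
            tn += 1
    return tp, fp, tn, fn


def precision(tp, fp):
    """Fraction of predicted binding residues that are true binders."""
    return tp / (tp + fp) if (tp + fp) else 0.0


def recall(tp, fn):
    """Fraction of true binding residues that were predicted (sensitivity)."""
    return tp / (tp + fn) if (tp + fn) else 0.0


def mcc(tp, fp, tn, fn):
    """Matthews correlation coefficient; returns 0.0 if undefined."""
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return (tp * tn - fp * fn) / denom if denom else 0.0


# Toy example (hypothetical labels for eight surface residues).
predicted = [1, 1, 0, 0, 1, 0, 0, 0]
actual = [1, 0, 0, 0, 1, 1, 0, 0]
tp, fp, tn, fn = confusion_counts(predicted, actual)
print(tp, fp, tn, fn)
print(precision(tp, fp), recall(tp, fn), mcc(tp, fp, tn, fn))
```

MCC uses all four cells of the confusion matrix, so it stays informative when binding residues are a small minority of surface residues, a situation where plain accuracy can look high even for a predictor that labels nothing as binding.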
