Combined use of physicochemical data and small-molecule crystallographic contact propensities to predict interactions in protein binding sites.

Author information

Nissink J Willem M, Taylor Robin

Affiliation

Cambridge Crystallographic Data Centre, 12 Union Road, Cambridge CB2 1EZ, UK.

Publication information

Org Biomol Chem. 2004 Nov 21;2(22):3238-49. doi: 10.1039/B405205F. Epub 2004 Aug 27.

Abstract

Knowledge-based methods are a good alternative to force-field-based methods for the analysis of sites of interaction in protein binding cavities. Both the Protein Data Bank (PDB) and the Cambridge Structural Database (CSD) offer a good amount of data on non-covalent interactions. Although different from protein-derived data, small-molecule crystal data from the CSD are worth looking at as they provide a much more abundant and diverse set of intermolecular contacts. CSD data, when properly corrected by use of octanol-water pi values, can be used to predict the type of ligand chemical group most likely to occupy a given position within a protein binding site. Comparison with observed positions of ligand groups shows that the success rates of these predictions vary from 23% to 84%. Often, the group predicted to be most preferred at a given position is similar but not identical to the observed ligand group; if these are considered successes, prediction success rates range from 71% to 94%. Using PDB data, the corresponding rates are 16% to 79%, and 61% to 96%. Specificity of prediction of NH groups is somewhat better when using PDB interaction data, but results of prediction of hydrophobic groups seem worse than those obtained with CSD data. We have analysed the importance of data selection by applying different filters to eliminate unwanted interactions from our knowledge-base. The presence of certain types of interactions can be undesirable if they are unrepresentative of biological situations (contact to solvent molecules in small-molecule crystal structures, secondary crystallographic contacts) or if they are likely to add noise to the data without conveying much new information (long-distance contacts, sparsely-populated data sets). The elimination of solvent contacts was found to have no effect on the prediction of ligand groups in our test set. Both secondary-contact filtering and noise filtering were found to have a clear beneficial effect on predictive ability.
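The two kinds of success rate quoted in the abstract (exact match, and match to a similar group) can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the group names, the similarity classes in `SIMILARITY_CLASS`, the propensity numbers, and the function names are hypothetical and are not taken from the paper, which does not describe its scoring at this level of detail.

```python
# Hypothetical similarity classes: groups in the same class count as
# "similar but not identical". These classes are an assumption.
SIMILARITY_CLASS = {
    "NH": "donor",
    "OH": "donor",
    "C=O": "acceptor",
    "CH3": "hydrophobic",
}


def predict_group(propensities):
    """Return the group with the highest propensity at one position."""
    return max(propensities, key=propensities.get)


def success_rates(cases):
    """Compute (exact_rate, similar_rate) over a list of test positions.

    Each case is a pair (observed_group, {group: propensity}).
    An exact match also counts as a similar match.
    """
    exact = 0
    similar = 0
    for observed, propensities in cases:
        predicted = predict_group(propensities)
        if predicted == observed:
            exact += 1
        if SIMILARITY_CLASS[predicted] == SIMILARITY_CLASS[observed]:
            similar += 1
    total = len(cases)
    return exact / total, similar / total


# Invented propensity values for four binding-site positions.
cases = [
    ("NH", {"NH": 2.1, "OH": 1.4, "C=O": 0.3, "CH3": 0.5}),   # exact match
    ("OH", {"NH": 1.8, "OH": 1.2, "C=O": 0.4, "CH3": 0.6}),   # similar (NH predicted)
    ("CH3", {"NH": 0.4, "OH": 0.5, "C=O": 1.9, "CH3": 1.1}),  # miss
    ("C=O", {"NH": 0.2, "OH": 0.3, "C=O": 2.5, "CH3": 0.9}),  # exact match
]

exact_rate, similar_rate = success_rates(cases)
print(f"exact: {exact_rate:.0%}, exact or similar: {similar_rate:.0%}")
```

With these invented inputs, the looser criterion gives the higher rate, which is the same direction as the gap between the two ranges reported above.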

