Setting up a large set of protein-ligand PDB complexes for the development and validation of knowledge-based docking algorithms.

Author information

Diago Luis A, Morell Persy, Aguilera Longendri, Moreno Ernesto

Affiliation

Department of Bioengineering, Faculty of Electrical Engineering, Havana Institute of Technology, Havana 19390, Cuba.

Publication information

BMC Bioinformatics. 2007 Aug 25;8:310. doi: 10.1186/1471-2105-8-310.

DOI:10.1186/1471-2105-8-310

PMID:17718923

Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC2008766/

Abstract

BACKGROUND

The number of algorithms available to predict ligand-protein interactions is large and ever-increasing, yet the number of test cases used to validate these methods is usually small and problem-dependent. Recently, several databases built on the Protein Data Bank have been released to further the understanding of protein-ligand interactions. Nevertheless, it still appears to be difficult to test docking methods on a large variety of complexes. In this paper we report the development of a new database of protein-ligand complexes tailored for testing docking algorithms.

METHODS

Using a new definition of molecular contact, small ligands contained in the 2005 PDB edition were identified and processed. The database was enriched with molecular properties; in particular, ligand atoms were typed automatically. A filtering procedure was applied to select a non-redundant dataset of complexes. Data mining was performed to obtain the frequencies of different types of atomic contacts. Docking simulations were run with the program DOCK.
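
The contact-frequency mining step can be illustrated with a minimal sketch. The abstract does not spell out the paper's new definition of molecular contact, so the sketch below assumes a plain distance cutoff instead; the cutoff value, the type labels, and the function names are all hypothetical and chosen only for illustration.

```python
from collections import Counter
from math import dist

# ASSUMPTION: a simple distance cutoff in angstroms stands in for the
# paper's own contact definition, which the abstract does not specify.
CONTACT_CUTOFF = 4.0


def count_contacts(ligand_atoms, surface_points, cutoff=CONTACT_CUTOFF):
    """Count (ligand atom type, surface point type) pairs within the cutoff.

    Each atom or point is a (type_label, (x, y, z)) tuple.
    """
    counts = Counter()
    for atom_type, atom_xyz in ligand_atoms:
        for point_type, point_xyz in surface_points:
            if dist(atom_xyz, point_xyz) <= cutoff:
                counts[(atom_type, point_type)] += 1
    return counts


def contact_frequencies(complexes, cutoff=CONTACT_CUTOFF):
    """Pool contact counts over many complexes and normalize to frequencies."""
    total = Counter()
    for ligand_atoms, surface_points in complexes:
        total.update(count_contacts(ligand_atoms, surface_points, cutoff))
    n_contacts = sum(total.values())
    if n_contacts == 0:
        return {}
    return {pair: n / n_contacts for pair, n in total.items()}


# Toy complex with hypothetical type labels and coordinates.
ligand = [("donor", (0.0, 0.0, 0.0)), ("hydrophobic", (5.0, 0.0, 0.0))]
surface = [
    ("acceptor", (0.0, 0.0, 3.0)),
    ("hydrophobic", (5.0, 0.0, 3.5)),
    ("acceptor", (20.0, 0.0, 0.0)),
]
frequencies = contact_frequencies([(ligand, surface)])
```

In this toy complex only two pairs fall within the cutoff, so each receives a frequency of 0.5. A real pipeline would read coordinates from PDB files and use the database's own atom typing rather than hand-written labels.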

RESULTS

We compiled a large database of small ligand-protein complexes, enriched with various calculated properties, that currently contains more than 6000 non-redundant structures. To demonstrate the value of the new database, we derived a new set of chemical matching rules for use with the program DOCK, based on contact frequencies between ligand atoms and points representing the protein surface, and showed that they are more efficient than the default set of rules included in that program.
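
One way to turn contact counts into matching rules is sketched below. The abstract does not describe how the authors derived their rules, so this is only an illustrative criterion, not the paper's method: a pair of types is allowed to match when it is observed more often than expected if atom types and surface point types paired independently. The function name, the threshold parameter, and the example counts are hypothetical.

```python
from collections import Counter


def derive_matching_rules(contact_counts, min_enrichment=1.0):
    """Return the set of (atom_type, point_type) pairs allowed to match.

    ASSUMPTION: a pair is allowed when its observed count exceeds
    min_enrichment times the count expected under independent pairing.
    This criterion is illustrative and is not taken from the paper.
    """
    total = sum(contact_counts.values())
    if total == 0:
        return set()

    # Marginal totals per ligand atom type and per surface point type.
    atom_totals = Counter()
    point_totals = Counter()
    for (atom_type, point_type), n in contact_counts.items():
        atom_totals[atom_type] += n
        point_totals[point_type] += n

    allowed = set()
    for (atom_type, point_type), n in contact_counts.items():
        expected = atom_totals[atom_type] * point_totals[point_type] / total
        if n > min_enrichment * expected:
            allowed.add((atom_type, point_type))
    return allowed


# Made-up counts for illustration only; these are not results from the paper.
example_counts = {
    ("donor", "acceptor"): 80,
    ("donor", "hydrophobic"): 20,
    ("hydrophobic", "acceptor"): 30,
    ("hydrophobic", "hydrophobic"): 70,
}
rules = derive_matching_rules(example_counts)
```

With these made-up counts, the donor-acceptor and hydrophobic-hydrophobic pairs are enriched over the independent-pairing expectation and are kept, while the two mixed pairs are rejected. Raising min_enrichment makes the rule set stricter.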

CONCLUSION

The new database constitutes a valuable resource for the development of knowledge-based docking algorithms and for testing docking programs on large sets of protein-ligand complexes. The new chemical matching rules proposed in this work significantly increase the success rate of docking simulations run with DOCK. The database developed in this work is available at http://cimlcsext.cim.sld.cu:8080/screeningbrowser/.
