Schreyer Adrian, Blundell Tom
Department of Biochemistry, University of Cambridge, 80 Tennis Court Road, Cambridge CB2 1GA, UK.
Chem Biol Drug Des. 2009 Feb;73(2):157-67. doi: 10.1111/j.1747-0285.2008.00762.x.
Harnessing data from the growing number of protein-ligand complexes in the Protein Data Bank is an important task in drug discovery. To benefit from the abundance of three-dimensional structures, structural data must be integrated with sequence and chemical data, and protein-small molecule interactions must be characterized structurally at the inter-atomic level. In this study, we present CREDO, a new publicly available database of protein-ligand interactions, which represents contacts as structural interaction fingerprints, implements novel features and is completely scriptable through its application programming interface. Features of CREDO include molecular shape descriptors implemented with ultrafast shape recognition, fragmentation of ligands in the Protein Data Bank, sequence-to-structure mapping and the identification of approved drugs. Selected analyses of these key features are presented to highlight a range of potential applications of CREDO. The CREDO dataset has been released into the public domain together with the application programming interface under a Creative Commons license at http://www-cryst.bioc.cam.ac.uk/credo. We believe that the free availability and numerous features of the CREDO database will be useful not only for commercial but also for academia-driven drug discovery programmes.
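The abstract names ultrafast shape recognition (USR) as the basis of CREDO's molecular shape descriptors but does not show how they are computed. As a rough illustration only, the following minimal Python sketch follows the commonly described USR formulation: twelve numbers per molecule, namely three moments of the atomic distance distribution measured from each of four reference points (the centroid, the atom closest to the centroid, the atom farthest from the centroid, and the atom farthest from that atom). The function names `usr_descriptor` and `usr_similarity`, the choice of standard deviation and signed cube root for the second and third moments, and the example coordinates are assumptions made for this sketch; they are not taken from the CREDO code base or its application programming interface.

```python
import math


def _moments(distances):
    """Return the mean, standard deviation and signed cube root of the
    third central moment of a list of distances (an assumed convention)."""
    n = len(distances)
    mean = sum(distances) / n
    variance = sum((d - mean) ** 2 for d in distances) / n
    third = sum((d - mean) ** 3 for d in distances) / n
    skew_root = math.copysign(abs(third) ** (1.0 / 3.0), third)
    return [mean, math.sqrt(variance), skew_root]


def usr_descriptor(coords):
    """Compute a 12-value USR-style shape descriptor.

    coords: list of (x, y, z) atomic coordinates for one conformer.
    """
    n = len(coords)
    # Reference point 1: molecular centroid.
    ctd = tuple(sum(c[i] for c in coords) / n for i in range(3))
    d_ctd = [math.dist(c, ctd) for c in coords]
    # Reference points 2 and 3: atoms closest to and farthest from the centroid.
    cst = coords[d_ctd.index(min(d_ctd))]
    fct = coords[d_ctd.index(max(d_ctd))]
    # Reference point 4: atom farthest from reference point 3.
    d_fct = [math.dist(c, fct) for c in coords]
    ftf = coords[d_fct.index(max(d_fct))]

    descriptor = []
    for ref in (ctd, cst, fct, ftf):
        descriptor.extend(_moments([math.dist(c, ref) for c in coords]))
    return descriptor


def usr_similarity(a, b):
    """Similarity in (0, 1]: inverse of one plus the mean absolute difference."""
    return 1.0 / (1.0 + sum(abs(x - y) for x, y in zip(a, b)) / len(a))


# Hypothetical coordinates, used only to exercise the functions.
molecule = [(0.0, 0.0, 0.0), (1.4, 0.1, 0.0), (2.1, 1.3, 0.2),
            (3.5, 1.2, -0.4), (1.9, 2.6, 0.9)]
shifted = [(x + 10.0, y - 3.0, z + 2.0) for x, y, z in molecule]
stretched = [(2.0 * x, 2.0 * y, 2.0 * z) for x, y, z in molecule]

same_shape = usr_similarity(usr_descriptor(molecule), usr_descriptor(shifted))
other_shape = usr_similarity(usr_descriptor(molecule), usr_descriptor(stretched))
```

Because the descriptor is built only from interatomic distances, a translated copy of the molecule scores essentially 1, while the stretched copy scores lower. Comparing two molecules therefore needs no structural alignment, only a comparison of two short vectors, which is what makes this kind of descriptor practical for screening a large collection of ligands.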