Harper G, Bravi G S, Pickett S D, Hussain J, Green D V S
GlaxoSmithKline, Gunnels Wood Road, Stevenage SG1 2NY, United Kingdom.
J Chem Inf Comput Sci. 2004 Nov-Dec;44(6):2145-56. doi: 10.1021/ci049860f.
Virtual screening and high-throughput screening (HTS) are two major components of lead discovery within the pharmaceutical industry. In this paper we describe improvements to previously published methods for similarity searching with reduced graphs, with a particular focus on ligand-based virtual screening, and describe a novel use of reduced graphs in the clustering of HTS data. Literature methods for reduced graph similarity searching encode the reduced graphs as binary fingerprints, an approach that has a number of limitations. We extend the definition of the reduced graph to include positively and negatively ionizable groups and introduce a new method for measuring the similarity of reduced graphs based on a weighted edit distance. Moving beyond simple similarity searching, we show how more flexible queries can be built using reduced graphs and describe a database system that allows iterative querying with multiple representations. Reduced graphs capture many important features of ligand-receptor interactions and, in conjunction with other whole-molecule descriptors, provide an informative way to review HTS data. In this context we introduce a method, which we have termed data-driven clustering, that identifies clusters of molecules that are represented by a particular whole-molecule descriptor and are enriched in active compounds.
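The weighted edit distance mentioned in the abstract can be illustrated with a deliberately simplified sketch. The abstract does not give the authors' algorithm, node vocabulary, or cost values, so everything below is an assumption made for illustration: reduced graphs are flattened to linear sequences of node-type codes (the published method operates on graphs), the codes `Ar`, `Al`, `D`, `A`, `P`, `N`, and `L` are hypothetical, and the substitution and insertion/deletion weights are invented. The sketch shows only the general idea that swapping chemically related node types can be made cheaper than swapping unrelated ones.

```python
from typing import Dict, FrozenSet, Sequence

# Hypothetical node-type codes for a linearised reduced graph (assumed, not
# taken from the paper): "Ar" aromatic ring, "Al" aliphatic ring, "D" donor,
# "A" acceptor, "P" positively ionizable, "N" negatively ionizable, "L" linker.

INDEL_COST = 1.0        # assumed cost of inserting or deleting one node
DEFAULT_SUB_COST = 1.0  # assumed cost of substituting unrelated node types

# Assumed reduced costs for substitutions between related node types.
SUB_COSTS: Dict[FrozenSet[str], float] = {
    frozenset(("Ar", "Al")): 0.5,
    frozenset(("D", "P")): 0.25,
    frozenset(("A", "N")): 0.25,
}


def sub_cost(a: str, b: str) -> float:
    """Cost of replacing node type a with node type b (symmetric)."""
    if a == b:
        return 0.0
    return SUB_COSTS.get(frozenset((a, b)), DEFAULT_SUB_COST)


def weighted_edit_distance(s: Sequence[str], t: Sequence[str]) -> float:
    """Dynamic-programming edit distance with type-dependent substitution costs."""
    rows, cols = len(s) + 1, len(t) + 1
    dist = [[0.0] * cols for _ in range(rows)]
    # Transforming a prefix into the empty sequence costs one deletion per node.
    for i in range(1, rows):
        dist[i][0] = i * INDEL_COST
    for j in range(1, cols):
        dist[0][j] = j * INDEL_COST
    for i in range(1, rows):
        for j in range(1, cols):
            dist[i][j] = min(
                dist[i - 1][j] + INDEL_COST,                          # delete s[i-1]
                dist[i][j - 1] + INDEL_COST,                          # insert t[j-1]
                dist[i - 1][j - 1] + sub_cost(s[i - 1], t[j - 1]),    # substitute
            )
    return dist[-1][-1]


def similarity(s: Sequence[str], t: Sequence[str]) -> float:
    """Map the distance to [0, 1]; valid here because no assumed cost exceeds INDEL_COST."""
    longest = max(len(s), len(t))
    if longest == 0:
        return 1.0
    return 1.0 - weighted_edit_distance(s, t) / (longest * INDEL_COST)


if __name__ == "__main__":
    query = ["Ar", "L", "D"]
    candidate = ["Al", "L", "P"]
    print(weighted_edit_distance(query, candidate), similarity(query, candidate))
```

Unlike a binary fingerprint comparison, a scheme of this kind lets the score reflect which node types differ rather than only whether they differ. A real implementation would need a graph-aware alignment and costs calibrated against data.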