Xue L, Godden J W, Bajorath J
Department of Computer-Aided Drug Discovery, Albany Molecular Research, Inc. (AMRI), Bothell Research Center (AMRI-BRC), 18804 North Creek Pkwy, Bothell, WA 98011, USA.
SAR QSAR Environ Res. 2003 Feb;14(1):27-40. doi: 10.1080/1062936021000058764.
Binary fingerprint representations of molecular structure and properties are convenient computational tools for similarity searching in compound databases and virtual screening (VS). We are investigating the design of relatively simple fingerprints for the identification of molecules having similar biological activity and recognition of remote similarity relationships. Since our designs are considerably shorter than other fingerprints used in VS, we have previously termed them "mini-fingerprints" (MFPs). A key aspect of the design strategy is the identification of suitable molecular descriptors. Whereas our initial fingerprint designs have relied on descriptor combinations that performed well in compound classification according to biological activity, second generation MFPs encode combinations of descriptors with high information content in large compound databases and high frequency of occurrence in drug-like molecules. Thus, the design of these new fingerprints does not depend on the analysis of specific classes of bioactive compounds, but rather on descriptor information content in large compound databases. Systematic evaluation of fingerprint performance in VS test calculations demonstrates that these new prototypes perform better than previously generated MFPs. The analysis described herein provides an example for the development of search tools for VS.
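As a rough illustration of how a binary fingerprint can be used for similarity searching, the sketch below ranks a toy database against a query. It is not the authors' MFP design: the abstract names no similarity metric, so the Tanimoto coefficient is an assumption here, and the bit positions, compound names, and helper functions (`to_fingerprint`, `tanimoto`, `rank_database`) are hypothetical.

```python
from typing import Dict, List, Tuple


def to_fingerprint(on_bits: List[int]) -> int:
    """Pack a list of set-bit positions into an integer bitmask."""
    fp = 0
    for bit in on_bits:
        fp |= 1 << bit
    return fp


def tanimoto(a: int, b: int) -> float:
    """Tanimoto coefficient (assumed metric): shared on-bits / distinct on-bits."""
    union = bin(a | b).count("1")
    if union == 0:
        return 0.0  # two empty fingerprints: define similarity as 0
    return bin(a & b).count("1") / union


def rank_database(query: int, database: Dict[str, int]) -> List[Tuple[str, float]]:
    """Sort database compounds by decreasing similarity to the query."""
    scored = [(name, tanimoto(query, fp)) for name, fp in database.items()]
    return sorted(scored, key=lambda item: item[1], reverse=True)


# Hypothetical toy fingerprints; the bit positions carry no chemical meaning.
query = to_fingerprint([0, 3, 5, 9])
database = {
    "cmpd_A": to_fingerprint([0, 3, 5, 9]),
    "cmpd_B": to_fingerprint([0, 3, 7]),
    "cmpd_C": to_fingerprint([1, 2, 4]),
}
ranking = rank_database(query, database)
```

In a virtual screen, the top of such a ranking would be inspected for compounds sharing the query's activity. A shorter fingerprint simply means fewer bit positions to compare per compound.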