Hu Ye, Lounkine Eugen, Bajorath Jürgen
Department of Life Science Informatics, B-IT, LIMES, Rheinische Friedrich-Wilhelms-Universität Bonn, Germany.
ChemMedChem. 2009 Apr;4(4):540-8. doi: 10.1002/cmdc.200800408.
The Pipeline Pilot extended connectivity fingerprints (ECFPs) are currently among the most popular similarity search tools in drug discovery settings. ECFPs do not have a fixed bit string format but generate variable numbers of structural features for individual test molecules. This variable string design makes ECFP representations amenable to compound-class-directed modification. We have devised an intuitive feature-filtering technique that focuses ECFP search calculations on feature string ensembles of given compound activity classes. In combination with a simple bit-density-dependent similarity function, feature filtering consistently improved the search performance of ECFP calculations based on Tanimoto similarity and state-of-the-art data fusion techniques on a diverse array of activity classes. Feature filtering and the bit density similarity metric are easily implemented in the Pipeline Pilot environment. The approach provides a viable alternative to conventional similarity searching and should be of general interest to further improve the success rate of practical ECFP applications.
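The abstract does not give the exact filtering rule or the form of the bit-density similarity function, so the following is only a minimal sketch of the general idea, under stated assumptions. Each fingerprint is modeled as a plain set of integer feature identifiers, which stands in for the variable-length ECFP feature list. The class "ensemble" is assumed to be the union of features seen in the reference actives, and `density_score` is a hypothetical stand-in for a density-dependent measure, not the function used in the paper. All function names are illustrative and are not part of Pipeline Pilot.

```python
from typing import Iterable, Set


def class_feature_ensemble(reference_fps: Iterable[Set[int]]) -> Set[int]:
    """Collect every feature seen in the reference compounds of one activity class.

    Assumption: the ensemble is the plain union of reference features.
    """
    ensemble: Set[int] = set()
    for fp in reference_fps:
        ensemble |= set(fp)
    return ensemble


def filter_features(fp: Set[int], ensemble: Set[int]) -> Set[int]:
    """Keep only those features of a molecule that also occur in the class ensemble."""
    return set(fp) & ensemble


def tanimoto(a: Set[int], b: Set[int]) -> float:
    """Tanimoto coefficient on feature sets: |A ∩ B| / |A ∪ B|."""
    union = a | b
    if not union:
        return 0.0
    return len(a & b) / len(union)


def density_score(fp: Set[int], ensemble: Set[int]) -> float:
    """Hypothetical density-style score: fraction of ensemble features present in fp.

    This is an illustrative assumption, not the similarity function from the paper.
    """
    if not ensemble:
        return 0.0
    return len(set(fp) & ensemble) / len(ensemble)


# Toy data: two reference actives and one database compound.
references = [{1, 2, 3}, {2, 3, 4}]
candidate = {2, 3, 9, 10}

ensemble = class_feature_ensemble(references)
filtered = filter_features(candidate, ensemble)

unfiltered_sim = tanimoto(candidate, references[0])
filtered_sim = tanimoto(filtered, references[0])
density = density_score(candidate, ensemble)
```

In this toy case, filtering discards the two candidate features (9 and 10) that never occur in the reference class, so the comparison against the first reference is no longer diluted by class-irrelevant features. Whether such filtering helps on real data depends on the activity class and the reference set.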