Wang Yuan, Bajorath Jürgen
Department of Life Science Informatics, B-IT, LIMES Program Unit Chemical Biology & Medicinal Chemistry, Rheinische Friedrich-Wilhelms-Universität, Dahlmannstrasse 2, D-53113 Bonn, Germany.
J Chem Inf Model. 2008 Sep;48(9):1754-9. doi: 10.1021/ci8002045. Epub 2008 Aug 13.
Fingerprints are molecular bit string representations and are among the most popular descriptors for similarity searching. In key-type fingerprints, each bit position monitors the presence or absence of a prespecified chemical or structural feature. In contrast to hashed fingerprints, this keyed design makes it possible to evaluate individual bit positions and the associated structural features during similarity searching. Bit silencing is introduced as a systematic approach to assess the contribution of each bit in a fingerprint to similarity search performance. From the resulting bit contribution profile, a bit position-dependent weight vector is derived that determines the relative weight of each bit on the basis of its individual contribution. By merging this weight vector with the Tanimoto coefficient, compound class-directed similarity metrics are obtained that further increase fingerprint search performance compared to conventional calculations of Tanimoto similarity.
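The workflow described in the abstract (silence one bit at a time, record the change in search performance, turn the profile into weights, and use the weights in the Tanimoto coefficient) can be illustrated with a minimal Python sketch. The abstract does not state how contributions are measured or how weights are computed from them, so everything specific here is an assumption made for illustration: performance is taken as the number of actives among the top-ranked compounds, silencing is modeled as giving a bit zero weight, and the mapping `weights_from_profile` with its `step` parameter is hypothetical. All function names are illustrative and are not taken from the paper.

```python
from typing import List, Optional, Sequence


def tanimoto(a: Sequence[int], b: Sequence[int],
             weights: Optional[Sequence[float]] = None) -> float:
    """Tanimoto coefficient on bit vectors, optionally weighted per bit.

    With weights=None every bit counts equally (conventional Tanimoto).
    """
    if weights is None:
        weights = [1.0] * len(a)
    common = sum(w for x, y, w in zip(a, b, weights) if x and y)
    either = sum(w for x, y, w in zip(a, b, weights) if x or y)
    return common / either if either > 0 else 0.0


def recovered_actives(reference: Sequence[int], database: List[Sequence[int]],
                      is_active: Sequence[bool], top_k: int,
                      weights: Optional[Sequence[float]] = None) -> int:
    """Count actives among the top_k database compounds ranked by similarity."""
    scores = [tanimoto(reference, fp, weights) for fp in database]
    ranked = sorted(range(len(database)), key=lambda i: -scores[i])
    return sum(1 for i in ranked[:top_k] if is_active[i])


def bit_contribution_profile(reference: Sequence[int],
                             database: List[Sequence[int]],
                             is_active: Sequence[bool],
                             top_k: int) -> List[int]:
    """Silence each bit in turn and record the change in recovered actives.

    Assumption: a silenced bit gets weight 0, which is equivalent to setting
    it to 0 in every fingerprint. A positive entry means the bit helps the
    search; a negative entry means the search improves without it.
    """
    n = len(reference)
    baseline = recovered_actives(reference, database, is_active, top_k)
    profile = []
    for bit in range(n):
        mask = [1.0] * n
        mask[bit] = 0.0  # silence this bit only
        silenced = recovered_actives(reference, database, is_active, top_k, mask)
        profile.append(baseline - silenced)
    return profile


def weights_from_profile(profile: Sequence[int], step: float = 0.5) -> List[float]:
    """Hypothetical mapping from contributions to non-negative bit weights."""
    return [max(0.0, 1.0 + step * c) for c in profile]


# Toy data: bits 2 and 3 are shared only with the inactive compound.
reference = [1, 1, 1, 1, 0, 0]
database = [[1, 1, 0, 0, 1, 1],   # active
            [0, 0, 1, 1, 0, 0]]   # inactive
is_active = [True, False]

profile = bit_contribution_profile(reference, database, is_active, top_k=1)
weights = weights_from_profile(profile)
print(profile)   # [0, 0, -1, -1, 0, 0]
print(weights)   # [1.0, 1.0, 0.5, 0.5, 1.0, 1.0]
print(recovered_actives(reference, database, is_active, 1))           # 0
print(recovered_actives(reference, database, is_active, 1, weights))  # 1
```

In this toy case the unweighted search ranks the inactive compound first, while the weighted search ranks the active compound first. The weights are derived and applied on the same two compounds, which only demonstrates the mechanics; a realistic evaluation would derive weights on one set of compounds and test them on another.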