Vogt Martin, Bajorath Jürgen
Department of Life Science Informatics, B-IT, Rheinische Friedrich-Wilhelms-Universität, Dahlmannstr. 2, D-53113 Bonn, Germany.
Chem Biol Drug Des. 2008 Jan;71(1):8-14. doi: 10.1111/j.1747-0285.2007.00602.x. Epub 2007 Dec 7.
Bayesian classifiers are increasingly used to distinguish active from inactive compounds and to search large databases for novel active molecules. We introduce an approach based on a Bayesian framework that directly combines the contributions of property descriptors and molecular fingerprints in the search for active compounds. Conventionally, property descriptors and fingerprints are used as alternative features in virtual screening methods. In the approach introduced here, probability distributions of descriptor values and fingerprint bit settings are calculated for active and database molecules, and the divergence between the resulting combined distributions is determined as a measure of biological activity. In test calculations on a large number of compound activity classes, this methodology consistently performed better than similarity searching using fingerprints and multiple reference compounds, and better than Bayesian screening calculations using probability distributions derived only from property descriptors. These findings demonstrate considerable synergy between different types of property descriptors and fingerprints in recognizing diverse structure-activity relationships, at least in the context of Bayesian modeling.
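The idea of combining descriptor value distributions and fingerprint bit distributions into one divergence-based measure can be illustrated with a minimal sketch. This is not the authors' implementation, and the abstract does not specify the details. The sketch rests on several assumptions: each descriptor is modeled as a Gaussian, each fingerprint bit as a Bernoulli variable, all features are treated as independent, and the Kullback-Leibler divergence stands in for the unnamed divergence measure. The function names and the toy compounds are hypothetical.

```python
import math

# Assumption: each compound is a pair (descriptor_values, fingerprint_bits).

def gaussian_params(values, min_sd=1e-6):
    """Mean and population standard deviation of one descriptor column."""
    mean = sum(values) / len(values)
    var = sum((v - mean) ** 2 for v in values) / len(values)
    return mean, max(math.sqrt(var), min_sd)

def bit_frequency(bits, pseudo=1.0):
    """Laplace-smoothed frequency of a set bit, kept away from 0 and 1."""
    return (sum(bits) + pseudo) / (len(bits) + 2 * pseudo)

def fit(compounds):
    """Estimate per-feature distributions for one compound set."""
    desc_cols = zip(*[d for d, _ in compounds])
    bit_cols = zip(*[b for _, b in compounds])
    return ([gaussian_params(c) for c in desc_cols],
            [bit_frequency(c) for c in bit_cols])

def kl_gaussian(mean_a, sd_a, mean_b, sd_b):
    """KL divergence between two univariate Gaussians, KL(a || b)."""
    return (math.log(sd_b / sd_a)
            + (sd_a ** 2 + (mean_a - mean_b) ** 2) / (2 * sd_b ** 2) - 0.5)

def kl_bernoulli(p, q):
    """KL divergence between two Bernoulli distributions, KL(p || q)."""
    return p * math.log(p / q) + (1 - p) * math.log((1 - p) / (1 - q))

def combined_divergence(active_model, db_model):
    """Sum of per-feature divergences (additive under independence)."""
    (g_a, b_a), (g_b, b_b) = active_model, db_model
    total = sum(kl_gaussian(ma, sa, mb, sb)
                for (ma, sa), (mb, sb) in zip(g_a, g_b))
    return total + sum(kl_bernoulli(p, q) for p, q in zip(b_a, b_b))

def gaussian_log_pdf(x, mean, sd):
    return -0.5 * math.log(2 * math.pi * sd * sd) - (x - mean) ** 2 / (2 * sd * sd)

def score(compound, active_model, db_model):
    """Log-likelihood ratio of 'active' versus 'database' for one compound."""
    descriptors, fingerprint = compound
    (g_a, b_a), (g_b, b_b) = active_model, db_model
    s = 0.0
    for x, (ma, sa), (mb, sb) in zip(descriptors, g_a, g_b):
        s += gaussian_log_pdf(x, ma, sa) - gaussian_log_pdf(x, mb, sb)
    for bit, p, q in zip(fingerprint, b_a, b_b):
        s += math.log(p / q) if bit else math.log((1 - p) / (1 - q))
    return s

# Toy data (hypothetical): two descriptors and three fingerprint bits.
actives = [([310.0, 2.1], [1, 1, 0]),
           ([295.0, 1.8], [1, 0, 0]),
           ([305.0, 2.4], [1, 1, 0])]
database = [([450.0, 4.0], [0, 1, 1]),
            ([200.0, 0.5], [0, 0, 1]),
            ([380.0, 3.2], [1, 0, 1]),
            ([520.0, 5.1], [0, 1, 0])]

active_model, db_model = fit(actives), fit(database)
divergence = combined_divergence(active_model, db_model)
score_active_like = score(([300.0, 2.0], [1, 1, 0]), active_model, db_model)
score_db_like = score(([480.0, 4.5], [0, 0, 1]), active_model, db_model)
```

Under the independence assumption, the contributions of descriptors and fingerprint bits simply add, which is what allows both feature types to enter a single score. A larger combined divergence suggests that the active set is easier to separate from the database, and ranking database compounds by `score` gives a simple screening order.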