Bayesian similarity searching in high-dimensional descriptor spaces combined with Kullback-Leibler descriptor divergence analysis.

Author information

Vogt Martin, Bajorath Jürgen

Affiliation

Department of Life Science Informatics, B-IT, Rheinische Friedrich-Wilhelms-Universität, Dahlmannstrasse 2, D-53113 Bonn, Germany.

Publication information

J Chem Inf Model. 2008 Feb;48(2):247-55. doi: 10.1021/ci700333t. Epub 2008 Jan 30.

DOI: 10.1021/ci700333t
PMID: 18229907
Abstract

We investigate an approach that combines Bayesian modeling of probability distributions of descriptor values of active and database molecules with Kullback-Leibler analysis of the divergence between these distributions. The methodology is used for Bayesian screening and also to predict compound recall rates. In our study, we analyze two fundamental approximations underlying the Bayesian screening approach: the assumption that descriptors are independent of each other and the assumption that their data set values follow normal distributions. In addition, we calculate Kullback-Leibler divergence for single descriptors, rather than multiple-feature distributions, in order to prioritize descriptors for screening calculations. The results show that descriptor correlation effects, which violate the assumption of feature independence, can lead to a notable reduction of compound recall in Bayesian screening. Controlling descriptor correlation effects plays a much more significant role in achieving high recall rates than approximating descriptor distributions by Gaussians. Furthermore, Kullback-Leibler divergence analysis is shown to systematically identify the descriptors that are most relevant for the outcome of Bayesian screening calculations.
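
The two approximations named in the abstract (independent descriptors, normally distributed values) can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the use of population standard deviations, and the toy descriptor names `d1` and `d2` are all assumptions made for illustration. Under the Gaussian assumption, the Kullback-Leibler divergence of a single descriptor has the closed form ln(σ_B/σ_A) + (σ_A² + (μ_A − μ_B)²) / (2σ_B²) − 1/2, where A denotes the active set and B the database. Under the independence assumption, a compound's score is the sum of per-descriptor log-likelihood ratios.

```python
import math
from statistics import mean, pstdev


def fit_gaussian(values):
    """Estimate (mean, std) of one descriptor; assumes a non-zero spread."""
    return mean(values), pstdev(values)


def gaussian_kl(mu_a, sigma_a, mu_b, sigma_b):
    """KL divergence D(N(mu_a, sigma_a^2) || N(mu_b, sigma_b^2)) in nats."""
    return (math.log(sigma_b / sigma_a)
            + (sigma_a ** 2 + (mu_a - mu_b) ** 2) / (2 * sigma_b ** 2)
            - 0.5)


def rank_descriptors(actives, database):
    """Rank descriptors by single-descriptor KL divergence, largest first.

    `actives` and `database` map a descriptor name to a list of values.
    """
    scores = {}
    for name in actives:
        mu_a, s_a = fit_gaussian(actives[name])
        mu_b, s_b = fit_gaussian(database[name])
        scores[name] = gaussian_kl(mu_a, s_a, mu_b, s_b)
    ranking = sorted(scores, key=scores.get, reverse=True)
    return ranking, scores


def log_normal_pdf(x, mu, sigma):
    """Natural log of the normal density N(mu, sigma^2) at x."""
    return (-0.5 * math.log(2 * math.pi * sigma ** 2)
            - (x - mu) ** 2 / (2 * sigma ** 2))


def screening_score(compound, actives, database):
    """Sum of per-descriptor log-likelihood ratios (independence assumed)."""
    total = 0.0
    for name, x in compound.items():
        mu_a, s_a = fit_gaussian(actives[name])
        mu_b, s_b = fit_gaussian(database[name])
        total += log_normal_pdf(x, mu_a, s_a) - log_normal_pdf(x, mu_b, s_b)
    return total


# Toy, made-up values for two hypothetical descriptors.
toy_actives = {"d1": [4.0, 5.0, 6.0], "d2": [1.0, 2.0, 3.0]}
toy_database = {"d1": [0.0, 1.0, 2.0], "d2": [1.0, 2.0, 3.0]}

ranking, kl_scores = rank_descriptors(toy_actives, toy_database)
score = screening_score({"d1": 5.0, "d2": 2.0}, toy_actives, toy_database)
```

In this toy setup `d1` separates the two sets while `d2` does not, so `d1` ranks first and contributes the whole screening score. Because the score simply adds per-descriptor terms, strongly correlated descriptors would be counted more than once, which is one way to picture the recall loss the abstract attributes to violated independence.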


Similar articles

1
Bayesian similarity searching in high-dimensional descriptor spaces combined with Kullback-Leibler descriptor divergence analysis.
J Chem Inf Model. 2008 Feb;48(2):247-55. doi: 10.1021/ci700333t. Epub 2008 Jan 30.
2
Bayesian screening for active compounds in high-dimensional chemical spaces combining property descriptors and molecular fingerprints.
Chem Biol Drug Des. 2008 Jan;71(1):8-14. doi: 10.1111/j.1747-0285.2007.00602.x. Epub 2007 Dec 7.
3
Introduction of an information-theoretic method to predict recovery rates of active compounds for Bayesian in silico screening: theory and screening trials.
J Chem Inf Model. 2007 Mar-Apr;47(2):337-41. doi: 10.1021/ci600418u. Epub 2007 Feb 16.
4
Development of a fingerprint reduction approach for Bayesian similarity searching based on Kullback-Leibler divergence analysis.
J Chem Inf Model. 2009 Jun;49(6):1347-58. doi: 10.1021/ci900087y.
5
Bayesian interpretation of a distance function for navigating high-dimensional descriptor spaces.
J Chem Inf Model. 2007 Jan-Feb;47(1):39-46. doi: 10.1021/ci600280b.
6
Predicting the performance of fingerprint similarity searching.
Methods Mol Biol. 2011;672:159-73. doi: 10.1007/978-1-60761-839-3_6.
7
Introduction of a generally applicable method to estimate retrieval of active molecules for similarity searching using fingerprints.
ChemMedChem. 2007 Sep;2(9):1311-20. doi: 10.1002/cmdc.200700090.
8
How similar are similarity searching methods? A principal component analysis of molecular descriptor space.
J Chem Inf Model. 2009 Jan;49(1):108-19. doi: 10.1021/ci800249s.
9
How do 2D fingerprints detect structurally diverse active compounds? Revealing compound subset-specific fingerprint features through systematic selection.
J Chem Inf Model. 2011 Sep 26;51(9):2254-65. doi: 10.1021/ci200275m. Epub 2011 Aug 8.
10
Similarity searching of chemical databases using atom environment descriptors (MOLPRINT 2D): evaluation of performance.
J Chem Inf Comput Sci. 2004 Sep-Oct;44(5):1708-18. doi: 10.1021/ci0498719.

Cited by

1
Analysis of Cell Signal Transduction Based on Kullback-Leibler Divergence: Channel Capacity and Conservation of Its Production Rate during Cascade.
Entropy (Basel). 2018 Jun 5;20(6):438. doi: 10.3390/e20060438.