
ccbmlib - a Python package for modeling Tanimoto similarity value distributions.

Authors

Vogt Martin, Bajorath Jürgen

Affiliation

Department of Life Science Informatics, B-IT, University of Bonn, Endenicher Allee 19c, Bonn, NRW, 53115, Germany.

Publication

F1000Res. 2020 Feb 10;9. doi: 10.12688/f1000research.22292.2. eCollection 2020.

DOI:10.12688/f1000research.22292.2
PMID:32161645
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC7050271/

Abstract

The ccbmlib Python package is a collection of modules for modeling similarity value distributions based on Tanimoto coefficients for fingerprints available in RDKit. It can be used to assess the statistical significance of Tanimoto coefficients and evaluate how molecular similarity is reflected when different fingerprint representations are used. Significance measures derived from p-values allow a quantitative comparison of similarity scores obtained from different fingerprint representations that might have very different value ranges. Furthermore, the package models conditional distributions of similarity coefficients for a given reference compound. The conditional significance score estimates where a test compound would be ranked in a similarity search. The models are based on the statistical analysis of feature distributions and feature correlations of fingerprints of a reference database. The resulting models have been evaluated for 11 RDKit fingerprints, taking a collection of ChEMBL compounds as a reference data set. For most fingerprints, highly accurate models were obtained, with differences of 1% or less for Tanimoto coefficients indicating high similarity.
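The two significance notions in the abstract can be illustrated with a minimal sketch. It does not use ccbmlib or RDKit, and it does not reproduce the package's statistical model of feature distributions and correlations. It simply counts how often a reference set reaches a given Tanimoto coefficient. All function names and the toy fingerprints (sets of on-bit indices) are hypothetical and chosen only for illustration.

```python
from itertools import combinations


def tanimoto(a: frozenset, b: frozenset) -> float:
    """Tanimoto coefficient of two fingerprints given as sets of on-bit indices."""
    union = len(a | b)
    return len(a & b) / union if union else 0.0


def empirical_p_value(tc: float, reference_values: list) -> float:
    """Fraction of reference similarity values that are at least as large as tc."""
    return sum(v >= tc for v in reference_values) / len(reference_values)


def conditional_p_value(tc: float, reference_fp: frozenset, database: list) -> float:
    """Fraction of database compounds at least as similar to reference_fp as tc.

    This approximates the relative rank a compound with similarity tc would
    obtain in a similarity search that uses reference_fp as the query.
    """
    values = [tanimoto(reference_fp, fp) for fp in database]
    return empirical_p_value(tc, values)


# Toy reference database: hypothetical fingerprints, not real compounds.
database = [
    frozenset({1, 2, 3, 4}),
    frozenset({1, 2, 5, 6}),
    frozenset({3, 4, 7, 8}),
    frozenset({9, 10, 11}),
    frozenset({1, 2, 3, 9}),
]

# Background distribution: Tanimoto values of all database pairs.
pairwise = [tanimoto(a, b) for a, b in combinations(database, 2)]

reference = database[0]           # reference compound (also part of the database)
test = frozenset({1, 2, 3, 5})    # test compound to be scored

tc = tanimoto(reference, test)
p_global = empirical_p_value(tc, pairwise)
p_conditional = conditional_p_value(tc, reference, database)

print(f"Tc = {tc:.2f}, p = {p_global:.2f}, conditional p = {p_conditional:.2f}")
```

Because the p-values are expressed as fractions of a reference distribution, they stay comparable across fingerprints whose raw Tanimoto values occupy very different ranges. A model-based approach such as the one the abstract describes avoids enumerating all database pairs, which the counting sketch above would need for a large reference set.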



