Statistical-based database fingerprint: chemical space dependent representation of compound databases.

Authors

Sánchez-Cruz Norberto, Medina-Franco José L

Affiliation

Department of Pharmacy, School of Chemistry, Universidad Nacional Autónoma de México, Avenida Universidad 3000, 04510, Mexico City, Mexico.

Publication

J Cheminform. 2018 Nov 22;10(1):55. doi: 10.1186/s13321-018-0311-x.

DOI:10.1186/s13321-018-0311-x
PMID:30467740
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC6755589/
Abstract

BACKGROUND

Simplified representations of compound databases have several applications in cheminformatics. Herein, we introduce an alternative and general method to build single-fingerprint representations of compound databases. The approach is inspired by the previously published modal fingerprints, which aim to capture the most significant bits of a fingerprint representation of a compound data set. The novelty of the statistical-based database fingerprint (SB-DFP) proposed here is that it is generated from binomial proportion comparisons, taking as reference the distribution of "1" bits in a large representative set of chemical space.
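The idea of building a fingerprint from binomial proportion comparisons can be sketched as follows. This is a minimal illustration, not the authors' implementation: the abstract does not state which test or significance level is used, so the one-sided two-proportion z-test, the critical value `z_crit=2.326` (roughly a 99% one-sided level), and the names `sb_dfp`, `dataset_fps`, `ref_freqs`, and `n_ref` are all assumptions made for this example.

```python
import math


def sb_dfp(dataset_fps, ref_freqs, n_ref, z_crit=2.326):
    """Sketch of a statistical-based database fingerprint.

    dataset_fps: list of equal-length 0/1 lists, one per compound in the data set.
    ref_freqs:   per-bit frequency of "1" in the reference set (values in [0, 1]).
    n_ref:       number of compounds in the reference set.
    z_crit:      assumed critical value for a one-sided test (hypothetical choice).

    A bit is set to 1 when its "1" proportion in the data set is significantly
    higher than in the reference set, according to a pooled two-proportion z-test.
    """
    n = len(dataset_fps)
    fingerprint = []
    for j, p_ref in enumerate(ref_freqs):
        # Proportion of compounds in the data set with bit j switched on.
        p_set = sum(fp[j] for fp in dataset_fps) / n
        # Pooled proportion under the null hypothesis of equal proportions.
        pooled = (p_set * n + p_ref * n_ref) / (n + n_ref)
        se = math.sqrt(pooled * (1.0 - pooled) * (1.0 / n + 1.0 / n_ref))
        z = (p_set - p_ref) / se if se > 0 else 0.0
        fingerprint.append(1 if z > z_crit else 0)
    return fingerprint


# Toy data: 4 compounds, 3 bits, and an invented reference distribution.
toy_set = [[1, 0, 1], [1, 0, 0], [1, 1, 0], [1, 0, 1]]
toy_ref = [0.1, 0.25, 0.5]
result = sb_dfp(toy_set, toy_ref, n_ref=1_000_000)
print(result)
```

In this toy case only the first bit is much more frequent in the data set than in the reference, so it is the only bit that is switched on. Because the outcome depends on the reference frequencies, the same data set can yield a different fingerprint against a different reference set, which is the sense in which the representation is chemical space dependent.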

RESULTS

To illustrate the method, SB-DFPs were constructed for 28 epigenetic target data sets retrieved from a recently published epigenomics database of interest in probe and drug discovery. For each target data set, SB-DFPs were built from two representative fingerprints of different design, using as reference a data set of more than 15 million compounds from ZINC. The application of SB-DFP was illustrated and compared to other methods through association relationships among the 28 epigenetic data sets and through similarity searching. SB-DFPs captured, overall, both the common features between data sets and the distinct features of each set. In similarity searching, SB-DFP equaled or outperformed other approaches for at least 20 of the 28 sets.
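Similarity searching with a single database fingerprint as the query can be sketched as below. The abstract does not name the similarity coefficient, so the Tanimoto coefficient is an assumption here, and `tanimoto`, `rank_by_similarity`, and the toy library are hypothetical names and data used only for illustration.

```python
def tanimoto(a, b):
    """Tanimoto coefficient between two equal-length 0/1 bit lists."""
    both = sum(1 for x, y in zip(a, b) if x and y)
    either = sum(1 for x, y in zip(a, b) if x or y)
    # Two all-zero fingerprints share no features; treat that as similarity 0.
    return both / either if either else 0.0


def rank_by_similarity(query_fp, library):
    """Rank library compounds by similarity to a query fingerprint.

    query_fp: 0/1 list, e.g. a database fingerprint built for a target data set.
    library:  dict mapping compound name to its 0/1 fingerprint.
    Returns a list of (name, score) pairs, highest score first.
    """
    scored = [(name, tanimoto(query_fp, fp)) for name, fp in library.items()]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


# Toy query and library with invented fingerprints.
query = [1, 1, 0, 0]
library = {
    "cpd_a": [1, 1, 0, 0],
    "cpd_b": [1, 0, 0, 1],
    "cpd_c": [0, 0, 1, 1],
}
ranking = rank_by_similarity(query, library)
for name, score in ranking:
    print(name, round(score, 3))
```

The same `tanimoto` function could also be applied to two database fingerprints to compare target data sets with each other, which is one plausible way to read the "association relationships" mentioned above.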

CONCLUSIONS

SB-DFP is a general approach, based on binomial proportion comparisons, for representing a compound data set with a single fingerprint. SB-DFP can be developed, at least in principle, from any fingerprint and any reference data set. SB-DFP is a good alternative for exploring relationships between targets through their associated compound data sets and for performing similarity searching.

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/0ef8/6755589/3d25ec77a841/13321_2018_311_Fig1_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/0ef8/6755589/649ddf19e467/13321_2018_311_Fig2_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/0ef8/6755589/0119d362ca4f/13321_2018_311_Fig3_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/0ef8/6755589/b82324e0c2ee/13321_2018_311_Fig4_HTML.jpg
