iSIM: instant similarity.

Authors

López-Pérez Kenneth, Kim Taewon D, Miranda-Quintana Ramón Alain

Affiliation

Department of Chemistry and Quantum Theory Project, University of Florida, Gainesville, Florida 32611, USA

Publication information

Digit Discov. 2024 May 7;3(6):1160-1171. doi: 10.1039/d4dd00041b. eCollection 2024 Jun 12.

DOI: 10.1039/d4dd00041b
PMID: 38873032
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC11167700/

Abstract

The quantification of molecular similarity has been present since the beginning of cheminformatics. Although several similarity indices and molecular representations have been reported, all of them ultimately reduce to the calculation of molecular similarities of only two objects at a time. Hence, to obtain the average similarity of a set of molecules, all the pairwise comparisons need to be computed, which demands a quadratic scaling in the number of computational resources. Here we propose an exact alternative to this problem: iSIM (instant similarity). iSIM performs comparisons of multiple molecules at the same time and yields the same value as the average pairwise comparisons of molecules represented by binary fingerprints and real-value descriptors. In this work, we introduce the mathematical framework and several applications of iSIM in chemical sampling, visualization, diversity selection, and clustering.
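
The linear-scaling idea in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation, and the function names are hypothetical. It assumes binary fingerprints stored as rows of a 0/1 matrix and uses two indices whose denominator is the fixed fingerprint length (Russell-Rao and Sokal-Michener), for which the average over all pairs can be recovered exactly from the column sums alone. If k of the N molecules have a given bit on, then k(k-1)/2 pairs share that on bit and (N-k)(N-k-1)/2 pairs share it as an off bit.

```python
from itertools import combinations

import numpy as np


def column_sum_similarity(fps):
    """Average pairwise Russell-Rao and Sokal-Michener similarity
    computed from column sums only (linear in the number of molecules).
    Illustrative sketch; names are hypothetical."""
    fps = np.asarray(fps, dtype=np.int64)
    n, m = fps.shape                      # n molecules, m bits per fingerprint
    if n < 2:
        raise ValueError("need at least two fingerprints")
    k = fps.sum(axis=0)                   # number of molecules with each bit on
    n_pairs = n * (n - 1) // 2
    a = (k * (k - 1) // 2).sum()          # on-on coincidences over all pairs
    d = ((n - k) * (n - k - 1) // 2).sum()  # off-off coincidences over all pairs
    russell_rao = a / (m * n_pairs)
    sokal_michener = (a + d) / (m * n_pairs)
    return float(russell_rao), float(sokal_michener)


def pairwise_similarity(fps):
    """Reference values: explicit loop over all pairs (quadratic)."""
    fps = np.asarray(fps, dtype=np.int64)
    n, m = fps.shape
    rr, sm = [], []
    for p, q in combinations(range(n), 2):
        a = int(np.sum(fps[p] & fps[q]))              # bits on in both
        d = int(np.sum((1 - fps[p]) & (1 - fps[q])))  # bits off in both
        rr.append(a / m)
        sm.append((a + d) / m)
    return float(np.mean(rr)), float(np.mean(sm))


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    fingerprints = rng.integers(0, 2, size=(200, 128))
    print(column_sum_similarity(fingerprints))
    print(pairwise_similarity(fingerprints))
```

Both functions should print the same pair of values up to floating-point rounding; the first touches each fingerprint once, while the second visits every pair. Indices whose denominator varies from pair to pair (such as Tanimoto) are not covered by this sketch.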

Similar articles

1
iSIM: instant similarity.
Digit Discov. 2024 May 7;3(6):1160-1171. doi: 10.1039/d4dd00041b. eCollection 2024 Jun 12.
2
Extended similarity indices: the benefits of comparing more than two objects simultaneously. Part 1: Theory and characteristics.
J Cheminform. 2021 Apr 23;13(1):32. doi: 10.1186/s13321-021-00505-3.
3
Extended similarity indices: the benefits of comparing more than two objects simultaneously. Part 2: speed, consistency, diversity selection.
J Cheminform. 2021 Apr 23;13(1):33. doi: 10.1186/s13321-021-00504-4.
4
Efficient clustering of large molecular libraries.
bioRxiv. 2024 Aug 10:2024.08.10.607459. doi: 10.1101/2024.08.10.607459.
5
Extended continuous similarity indices: theory and application for QSAR descriptor selection.
J Comput Aided Mol Des. 2022 Mar;36(3):157-173. doi: 10.1007/s10822-022-00444-7. Epub 2022 Mar 15.
6
Extended many-item similarity indices for sets of nucleotide and protein sequences.
Comput Struct Biotechnol J. 2021 Jun 16;19:3628-3639. doi: 10.1016/j.csbj.2021.06.021. eCollection 2021.
7
Distance phenomena in high-dimensional chemical descriptor spaces: consequences for similarity-based approaches.
J Comput Chem. 2009 Nov 15;30(14):2285-96. doi: 10.1002/jcc.21218.
8
Blocked inverted indices for exact clustering of large chemical spaces.
J Chem Inf Model. 2014 Sep 22;54(9):2395-401. doi: 10.1021/ci500150t. Epub 2014 Sep 2.
9
A Step-by-Step Guide to Instant Structured Illumination Microscopy (iSIM).
Methods Mol Biol. 2021;2304:347-359. doi: 10.1007/978-1-0716-1402-0_19.
10
GO functional similarity clustering depends on similarity measure, clustering method, and annotation completeness.
BMC Bioinformatics. 2019 Mar 27;20(1):155. doi: 10.1186/s12859-019-2752-2.

Cited by

1
Undersampling techniques for non-linear chemical space visualization.
bioRxiv. 2025 Jul 7:2025.07.03.663077. doi: 10.1101/2025.07.03.663077.
2
Scaling k-Means for Multi-Million Frames: A Stratified NANI Approach for Large-Scale MD Simulations.
bioRxiv. 2025 Jun 18:2025.06.15.659780. doi: 10.1101/2025.06.15.659780.
3
iCliff Taylor's Version: Robust and Efficient Activity Cliff Determination.

References

1
Sampling and Mapping Chemical Space with Extended Similarity Indices.
Molecules. 2023 Aug 30;28(17):6333. doi: 10.3390/molecules28176333.
2
Chemical Library Space: Definition and DNA-Encoded Library Comparison Study Case.
J Chem Inf Model. 2023 Jul 10;63(13):4042-4055. doi: 10.1021/acs.jcim.3c00520. Epub 2023 Jun 27.
3
Exploring activity landscapes with extended similarity: is Tanimoto enough?
J Chem Inf Model. 2025 Jun 9;65(11):5801-5810. doi: 10.1021/acs.jcim.5c00506. Epub 2025 May 21.
4
SHINE: Deterministic Many-to-Many Clustering of Molecular Pathways.
J Chem Inf Model. 2025 May 26;65(10):4775-4782. doi: 10.1021/acs.jcim.5c00240. Epub 2025 May 6.
5
Extended Quality (eQual): Radial Threshold Clustering Based on n-ary Similarity.
J Chem Inf Model. 2025 May 26;65(10):5062-5070. doi: 10.1021/acs.jcim.4c02341. Epub 2025 May 1.
6
BitBIRCH Clustering Refinement Strategies.
bioRxiv. 2025 Mar 24:2025.03.20.644337. doi: 10.1101/2025.03.20.644337.
7
Hierarchical Extended Linkage Method (HELM)'s Deep Dive into Hybrid Clustering Strategies.
bioRxiv. 2025 Mar 10:2025.03.05.641742. doi: 10.1101/2025.03.05.641742.
8
iCliff Taylor's version: Robust and Efficient Activity Cliff Determination.
bioRxiv. 2025 Mar 13:2025.03.09.642269. doi: 10.1101/2025.03.09.642269.
9
BitBIRCH: efficient clustering of large molecular libraries.
Digit Discov. 2025 Mar 13;4(4):1042-1051. doi: 10.1039/d5dd00030k. eCollection 2025 Apr 9.
10
CADENCE: Clustering Algorithm - Density-based Exploration and Novelty Clustering with Efficiency.
bioRxiv. 2025 Feb 28:2025.02.24.639863. doi: 10.1101/2025.02.24.639863.
Mol Inform. 2023 Jul;42(7):e2300056. doi: 10.1002/minf.202300056. Epub 2023 Jun 7.
4
Exposing the Limitations of Molecular Machine Learning with Activity Cliffs.
J Chem Inf Model. 2022 Dec 12;62(23):5938-5951. doi: 10.1021/acs.jcim.2c01073. Epub 2022 Dec 1.
5
Quantum Chemical Roots of Machine-Learning Molecular Similarity Descriptors.
J Chem Theory Comput. 2022 Nov 8;18(11):6670-6689. doi: 10.1021/acs.jctc.2c00718. Epub 2022 Oct 11.
6
Chemical similarity of molecules with physiological response.
Mol Divers. 2023 Aug;27(4):1603-1612. doi: 10.1007/s11030-022-10514-5. Epub 2022 Aug 17.
7
Graph-based molecular Pareto optimisation.
Chem Sci. 2022 Jun 2;13(25):7526-7535. doi: 10.1039/d2sc00821a. eCollection 2022 Jun 29.
8
Coverage Score: A Model Agnostic Method to Efficiently Explore Chemical Space.
J Chem Inf Model. 2022 Sep 26;62(18):4391-4402. doi: 10.1021/acs.jcim.2c00258. Epub 2022 Jul 22.
9
Molecular Dynamics Simulations and Diversity Selection by Extended Continuous Similarity Indices.
J Chem Inf Model. 2022 Jul 25;62(14):3415-3425. doi: 10.1021/acs.jcim.2c00433. Epub 2022 Jul 14.
10
Extended continuous similarity indices: theory and application for QSAR descriptor selection.
J Comput Aided Mol Des. 2022 Mar;36(3):157-173. doi: 10.1007/s10822-022-00444-7. Epub 2022 Mar 15.