
iSIM-Sigma: Efficient Standard Deviation Calculation for Molecular Similarity.

Authors

Lopez-Perez Kenneth, Zhao Bill, Miranda-Quintana Ramón Alain

Affiliations

Department of Chemistry and Quantum Theory Project, University of Florida, Gainesville, Florida 32611, United States.

Publication information

J Chem Inf Model. 2025 Jul 14;65(13):6797-6808. doi: 10.1021/acs.jcim.5c00894. Epub 2025 Jun 17.

DOI: 10.1021/acs.jcim.5c00894
PMID: 40528353
Abstract

The average and variance of the molecular similarities in a set are valuable for cheminformatics tasks such as chemical space exploration and subset selection. However, calculating the variance of the complete similarity matrix has quadratic complexity, $O(N^2)$. As the sizes of molecular libraries constantly increase, this pairwise approach becomes infeasible. In this work, we present an approach to calculate the exact standard deviation of molecular similarities in a set (with $N$ molecules and $M$ features) for the Russell-Rao (RR) and Sokal-Michener (SM) similarity indexes in $O(NM^2)$ complexity. Furthermore, we present a highly accurate linear complexity approximation, $O(N)$, based on sampling representative molecules from the set. The proposed approximation can be extended to other similarity indices, including the popular Jaccard-Tanimoto (JT). By sampling only 50 molecules, the proposed method can estimate the standard deviation of similarities in a set with an RMSE lower than 0.01 for sets of up to 50,000 molecules. In comparison, our results show that random sampling does not guarantee a good approximation with the same number of selected molecules.
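To make the complexity claim concrete, the following is a minimal sketch (not the authors' implementation) of why the exact standard deviation of Russell-Rao similarities can be obtained without enumerating all pairs. It assumes binary fingerprints stored as equal-length lists of 0/1, takes the RR similarity of two molecules to be the number of shared on-bits divided by $M$, and reports the population standard deviation over the $N(N-1)/2$ distinct pairs. The function names `rr_std_pairwise` and `rr_std_feature_space` are hypothetical. The key identity is that the sum of squared shared-bit counts over all ordered pairs equals the sum of squared entries of the $M \times M$ feature co-occurrence matrix, which costs $O(NM^2)$ to build.

```python
from itertools import combinations
from math import sqrt


def rr_std_pairwise(fps):
    """Reference: std of RR similarity over all distinct pairs, O(N^2 * M)."""
    m = len(fps[0])
    sims = [sum(a & b for a, b in zip(x, y)) / m
            for x, y in combinations(fps, 2)]
    mean = sum(sims) / len(sims)
    return sqrt(sum((s - mean) ** 2 for s in sims) / len(sims))


def rr_std_feature_space(fps):
    """Same quantity from feature co-occurrence counts, O(N * M^2).

    Assumes at least two molecules and binary (0/1) fingerprints.
    """
    n, m = len(fps), len(fps[0])
    # co[k][l] = number of molecules with bits k and l both on.
    co = [[0] * m for _ in range(m)]
    for x in fps:
        on = [k for k, bit in enumerate(x) if bit]
        for k in on:
            row = co[k]
            for l in on:
                row[l] += 1
    pop = [sum(x) for x in fps]  # on-bit count per molecule (self-overlap)
    pairs = n * (n - 1) // 2
    # Sums over ordered pairs (including i == j), minus the diagonal, halved.
    sum_a = (sum(co[k][k] ** 2 for k in range(m)) - sum(pop)) / 2
    sum_a2 = (sum(v * v for row in co for v in row)
              - sum(p * p for p in pop)) / 2
    mean = sum_a / (pairs * m)
    mean_sq = sum_a2 / (pairs * m * m)
    # Guard against tiny negative values from floating-point round-off.
    return sqrt(max(mean_sq - mean ** 2, 0.0))


demo = [[1, 1, 0, 1], [1, 0, 0, 1], [0, 1, 1, 0], [1, 1, 1, 1], [0, 0, 1, 0]]
print(rr_std_pairwise(demo), rr_std_feature_space(demo))
```

Both functions should agree up to floating-point round-off; the second never forms the $N \times N$ similarity matrix, so its cost grows linearly with the number of molecules. The sampling-based approximation described in the abstract is not sketched here because the abstract does not specify how the representative molecules are chosen.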

Similar articles

1. iSIM-Sigma: Efficient Standard Deviation Calculation for Molecular Similarity.
   J Chem Inf Model. 2025 Jul 14;65(13):6797-6808. doi: 10.1021/acs.jcim.5c00894. Epub 2025 Jun 17.
2. iSIM-sigma: efficient standard deviation calculation for molecular similarity.
   bioRxiv. 2024 Nov 26:2024.11.24.625084. doi: 10.1101/2024.11.24.625084.
3. Short-Term Memory Impairment.
4. Signs and symptoms to determine if a patient presenting in primary care or hospital outpatient settings has COVID-19.
   Cochrane Database Syst Rev. 2022 May 20;5(5):CD013665. doi: 10.1002/14651858.CD013665.pub3.
5. Carbon dioxide detection for diagnosis of inadvertent respiratory tract placement of enterogastric tubes in children.
   Cochrane Database Syst Rev. 2025 Feb 19;2(2):CD011196. doi: 10.1002/14651858.CD011196.pub2.
6. Sexual Harassment and Prevention Training.
7. The Black Book of Psychotropic Dosing and Monitoring.
   Psychopharmacol Bull. 2024 Jul 8;54(3):8-59.
8. Systemic pharmacological treatments for chronic plaque psoriasis: a network meta-analysis.
   Cochrane Database Syst Rev. 2017 Dec 22;12(12):CD011535. doi: 10.1002/14651858.CD011535.pub2.
9. Can a Liquid Biopsy Detect Circulating Tumor DNA With Low-passage Whole-genome Sequencing in Patients With a Sarcoma? A Pilot Evaluation.
   Clin Orthop Relat Res. 2025 Jan 1;483(1):39-48. doi: 10.1097/CORR.0000000000003161. Epub 2024 Jun 21.
10. Antidepressants for pain management in adults with chronic pain: a network meta-analysis.
    Health Technol Assess. 2024 Oct;28(62):1-155. doi: 10.3310/MKRT2948.

Cited by

1. Undersampling techniques for non-linear chemical space visualization.
   bioRxiv. 2025 Jul 7:2025.07.03.663077. doi: 10.1101/2025.07.03.663077.