Applying Mondrian Cross-Conformal Prediction To Estimate Prediction Confidence on Large Imbalanced Bioactivity Data Sets.

Affiliation

Swetox, Karolinska Institutet, Unit of Toxicology Sciences, Södertälje 15136, Sweden.

Publication information

J Chem Inf Model. 2017 Jul 24;57(7):1591-1598. doi: 10.1021/acs.jcim.7b00159. Epub 2017 Jun 30.

DOI:10.1021/acs.jcim.7b00159
PMID:28628322
Abstract

Conformal prediction has been proposed as a more rigorous way to define prediction confidence compared to other application domain concepts that have earlier been used for QSAR modeling. One main advantage of such a method is that it provides a prediction region potentially with multiple predicted labels, which contrasts to the single valued (regression) or single label (classification) output predictions by standard QSAR modeling algorithms. Standard conformal prediction might not be suitable for imbalanced data sets. Therefore, Mondrian cross-conformal prediction (MCCP) which combines the Mondrian inductive conformal prediction with cross-fold calibration sets has been introduced. In this study, the MCCP method was applied to 18 publicly available data sets that have various imbalance levels varying from 1:10 to 1:1000 (ratio of active/inactive compounds). Our results show that MCCP in general performed well on bioactivity data sets with various imbalance levels. More importantly, the method not only provides confidence of prediction and prediction regions compared to standard machine learning methods but also produces valid predictions for the minority class. In addition, a compound similarity based nonconformity measure was investigated. Our results demonstrate that although it gives valid predictions, its efficiency is much worse than that of model dependent metrics.
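
The procedure the abstract describes (class-conditional, i.e. Mondrian, calibration repeated over cross-validation folds) can be illustrated with a minimal sketch. This is not the authors' implementation. The nearest-centroid model, the distance-based nonconformity score, the synthetic 1:10 data, and the averaging of per-fold p-values are all assumptions made for illustration, and every function name is hypothetical.

```python
import random
from statistics import mean

def fit_centroids(X, y):
    """Toy stand-in model (an assumption): one mean feature vector per class."""
    cents = {}
    for label in set(y):
        rows = [x for x, t in zip(X, y) if t == label]
        cents[label] = [mean(col) for col in zip(*rows)]
    return cents

def dist(a, b):
    return sum((u - v) ** 2 for u, v in zip(a, b)) ** 0.5

def nonconformity(cents, x, label):
    """Model-dependent score: larger means x looks less like `label`."""
    nearest_other = min(dist(x, c) for l, c in cents.items() if l != label)
    return dist(x, cents[label]) - nearest_other

def mccp_p_values(X, y, x_new, k=5, seed=0):
    """Per-label p-values for x_new; assumes every class has >= 2 examples."""
    labels = sorted(set(y))
    rng = random.Random(seed)
    # Stratified folds, so each training split still contains every class.
    folds = [[] for _ in range(k)]
    for l in labels:
        members = [i for i, t in enumerate(y) if t == l]
        rng.shuffle(members)
        for j, i in enumerate(members):
            folds[j % k].append(i)
    per_fold = {l: [] for l in labels}
    for cal in folds:
        held_out = set(cal)
        train = [i for i in range(len(X)) if i not in held_out]
        cents = fit_centroids([X[i] for i in train], [y[i] for i in train])
        for l in labels:
            # Mondrian step: calibrate only against held-out examples of class l.
            cal_scores = [nonconformity(cents, X[i], l) for i in cal if y[i] == l]
            a_new = nonconformity(cents, x_new, l)
            n_ge = sum(s >= a_new for s in cal_scores)
            per_fold[l].append((n_ge + 1) / (len(cal_scores) + 1))
    # Aggregating by the mean over folds is an assumption of this sketch.
    return {l: mean(ps) for l, ps in per_fold.items()}

def prediction_region(p_values, epsilon):
    """Labels that cannot be rejected at significance level epsilon."""
    return {l for l, p in p_values.items() if p > epsilon}

# Synthetic imbalanced data (1:10 active/inactive), purely illustrative.
rng = random.Random(42)
X = [(rng.gauss(0, 1), rng.gauss(0, 1)) for _ in range(200)]
y = ["inactive"] * 200
X += [(rng.gauss(5, 1), rng.gauss(5, 1)) for _ in range(20)]
y += ["active"] * 20

p = mccp_p_values(X, y, (5.0, 5.0))
region = prediction_region(p, epsilon=0.1)
print(p, region)
```

Because each label is calibrated only against held-out examples of its own class, the minority class gets its own p-value distribution instead of being swamped by the majority class. The prediction region may contain one label, both labels, or none, depending on the chosen significance level.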

Similar articles

1. Applying Mondrian Cross-Conformal Prediction To Estimate Prediction Confidence on Large Imbalanced Bioactivity Data Sets.
J Chem Inf Model. 2017 Jul 24;57(7):1591-1598. doi: 10.1021/acs.jcim.7b00159. Epub 2017 Jun 30.
2. Conformal Regression for Quantitative Structure-Activity Relationship Modeling-Quantifying Prediction Uncertainty.
J Chem Inf Model. 2018 May 29;58(5):1132-1140. doi: 10.1021/acs.jcim.8b00054. Epub 2018 May 10.
3. Multitask Modeling with Confidence Using Matrix Factorization and Conformal Prediction.
J Chem Inf Model. 2019 Apr 22;59(4):1598-1604. doi: 10.1021/acs.jcim.9b00027. Epub 2019 Apr 5.
4. Predicting With Confidence: Using Conformal Prediction in Drug Discovery.
J Pharm Sci. 2021 Jan;110(1):42-49. doi: 10.1016/j.xphs.2020.09.055. Epub 2020 Oct 17.
5. Introducing conformal prediction in predictive modeling. A transparent and flexible alternative to applicability domain determination.
J Chem Inf Model. 2014 Jun 23;54(6):1596-603. doi: 10.1021/ci5001168. Epub 2014 May 21.
6. Conformal Prediction Classification of a Large Data Set of Environmental Chemicals from ToxCast and Tox21 Estrogen Receptor Assays.
Chem Res Toxicol. 2016 Jun 20;29(6):1003-10. doi: 10.1021/acs.chemrestox.6b00037. Epub 2016 May 13.
7. The Relative Importance of Domain Applicability Metrics for Estimating Prediction Errors in QSAR Varies with Training Set Diversity.
J Chem Inf Model. 2015 Jun 22;55(6):1098-107. doi: 10.1021/acs.jcim.5b00110. Epub 2015 Jun 4.
8. Introducing conformal prediction in predictive modeling for regulatory purposes. A transparent and flexible alternative to applicability domain determination.
Regul Toxicol Pharmacol. 2015 Mar;71(2):279-84. doi: 10.1016/j.yrtph.2014.12.021. Epub 2015 Jan 2.
9. Dynamic applicability domain (dAD): compound-target binding affinity estimates with local conformal prediction.
Bioinformatics. 2023 Aug 1;39(8). doi: 10.1093/bioinformatics/btad465.
10. Development and Evaluation of Conformal Prediction Methods for Quantitative Structure-Activity Relationship.
ACS Omega. 2024 Jun 27;9(27):29478-29490. doi: 10.1021/acsomega.4c02017. eCollection 2024 Jul 9.

Cited by

1. Functional protein mining with conformal guarantees.
Nat Commun. 2025 Jan 2;16(1):85. doi: 10.1038/s41467-024-55676-y.
2. Low concentration cell painting images enable the identification of highly potent compounds.
Sci Rep. 2024 Oct 17;14(1):24403. doi: 10.1038/s41598-024-75401-5.
3. CPSign: conformal prediction for cheminformatics modeling.
J Cheminform. 2024 Jun 28;16(1):75. doi: 10.1186/s13321-024-00870-9.
4. Leveraging Cell Painting Images to Expand the Applicability Domain and Actively Improve Deep Learning Quantitative Structure-Activity Relationship Models.
Chem Res Toxicol. 2023 Jul 17;36(7):1028-1036. doi: 10.1021/acs.chemrestox.2c00404. Epub 2023 Jun 16.
5. Predicting Endocrine Disruption Using Conformal Prediction - A Prioritization Strategy to Identify Hazardous Chemicals with Confidence.
Chem Res Toxicol. 2023 Jan 16;36(1):53-65. doi: 10.1021/acs.chemrestox.2c00267. Epub 2022 Dec 19.
6. A universal similarity based approach for predictive uncertainty quantification in materials science.
Sci Rep. 2022 Sep 2;12(1):14931. doi: 10.1038/s41598-022-19205-5.
7. In silico toxicology: From structure-activity relationships towards deep learning and adverse outcome pathways.
Wiley Interdiscip Rev Comput Mol Sci. 2020 Jul-Aug;10(4):e1475. doi: 10.1002/wcms.1475. Epub 2020 Mar 31.
8. The effect of noise on the predictive limit of QSAR models.
J Cheminform. 2021 Nov 25;13(1):92. doi: 10.1186/s13321-021-00571-7.
9. Deep Neural Networks for QSAR.
Methods Mol Biol. 2022;2390:233-260. doi: 10.1007/978-1-0716-1787-8_10.
10. Translating polygenic risk scores for clinical use by estimating the confidence bounds of risk prediction.
Nat Commun. 2021 Sep 6;12(1):5276. doi: 10.1038/s41467-021-25014-7.