The index lift in data mining has a close relationship with the association measure relative risk in epidemiological studies.

Affiliations

University of Alberta School of Public Health, Edmonton, AB, Canada.

Department of Computing Science, University of Alberta, Edmonton, AB, Canada.

Publication information

BMC Med Inform Decis Mak. 2019 Jun 17;19(1):112. doi: 10.1186/s12911-019-0838-4.

DOI:10.1186/s12911-019-0838-4
PMID:31208407
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC6580490/
Abstract

BACKGROUND

Data mining tools have been increasingly used in health research, with the promise of accelerating discoveries. Lift is a standard association metric in the data mining community. However, health researchers struggle with the interpretation of lift. As a result, dissemination of data mining results can be met with hesitation. The relative risk and odds ratio are standard association measures in the health domain, due to their straightforward interpretation and comparability across populations. We aimed to investigate the lift-relative risk and the lift-odds ratio relationships, and provide tools to convert lift to the relative risk and odds ratio.

METHODS

We derived equations linking lift to the relative risk and to the odds ratio. We discussed how lift, relative risk, and odds ratio behave numerically with varying association strengths and exposure prevalence levels. The lift-relative risk relationship was further illustrated using a high-dimensional dataset which examines the association of exposure to airborne pollutants and adverse birth outcomes. We conducted spatial association rule mining using the Kingfisher algorithm, which identified association rules using its built-in lift metric. We directly estimated relative risks and odds ratios from 2 by 2 tables for each identified rule. These values were compared to the corresponding lift values, and relative risks and odds ratios were computed using the derived equations.
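The direct estimation from a 2 by 2 table described above can be sketched as follows. This is a minimal illustration using the standard textbook definitions, not the authors' code; the function names and the cell labels a, b, c, d are assumptions introduced here, and the counts in the usage note are made up for illustration.

```python
import math


def association_measures(a, b, c, d):
    """Compute lift, relative risk, and odds ratio from 2 by 2 counts.

    Hypothetical cell layout (an assumption of this sketch):
      a = exposed with outcome,    b = exposed without outcome
      c = unexposed with outcome,  d = unexposed without outcome
    """
    if min(a, b, c, d) <= 0:
        # Keep the sketch simple: require every cell to be positive so that
        # no ratio below divides by zero.
        raise ValueError("all four cell counts must be positive")
    n = a + b + c + d
    p_o_given_e = a / (a + b)        # P(O | E)
    p_o_given_not_e = c / (c + d)    # P(O | not E)
    p_o = (a + c) / n                # P(O)
    return {
        "lift": p_o_given_e / p_o,
        "relative_risk": p_o_given_e / p_o_given_not_e,
        "odds_ratio": (a * d) / (b * c),
    }


def ordered_by_log_distance(measures, tol=1e-12):
    """Check |log(OR)| >= |log(RR)| >= |log(lift)| for one table."""
    log_or = abs(math.log(measures["odds_ratio"]))
    log_rr = abs(math.log(measures["relative_risk"]))
    log_lift = abs(math.log(measures["lift"]))
    return log_or + tol >= log_rr and log_rr + tol >= log_lift
```

For example, with the illustrative counts a=30, b=70, c=10, d=90, the lift is about 1.5, the relative risk about 3, and the odds ratio about 3.86, so the three measures sit on the same side of the null value 1 but at different distances from it.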

RESULTS

As the exposure-outcome association strengthens, the odds ratio and relative risk move away from 1 faster numerically than lift, i.e. |log(odds ratio)| ≥ |log(relative risk)| ≥ |log(lift)|. In addition, lift is bounded above by the smaller of the inverse probabilities of the outcome and the exposure, i.e. lift ≤ min(1/P(O), 1/P(E)). Unlike the relative risk and odds ratio, lift depends on the exposure prevalence for fixed outcomes. For example, when an exposure A and a less prevalent exposure B have the same relative risk for an outcome, exposure A has a lower lift than B.
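The dependence of lift on exposure prevalence and the upper bound can both be seen from a short derivation. This is a sketch from the standard definitions of lift and relative risk and may not match the exact form of the equations in the paper; the shorthand p_1 = P(O | E) and p_0 = P(O | not E) is notation introduced here.

```latex
\begin{align*}
\text{lift} &= \frac{P(O \mid E)}{P(O)} = \frac{P(O \cap E)}{P(O)\,P(E)},
\qquad \text{RR} = \frac{p_1}{p_0}, \\
P(O) &= P(E)\,p_1 + \bigl(1 - P(E)\bigr)\,p_0, \\
\text{lift} &= \frac{p_1}{P(E)\,p_1 + \bigl(1 - P(E)\bigr)\,p_0}
            = \frac{\text{RR}}{1 + P(E)\,(\text{RR} - 1)}, \\
P(O \cap E) &\le \min\bigl(P(O),\, P(E)\bigr)
\;\Longrightarrow\;
\text{lift} \le \min\!\left(\frac{1}{P(E)},\, \frac{1}{P(O)}\right).
\end{align*}
```

For a fixed RR greater than 1, the denominator 1 + P(E)(RR - 1) grows with P(E), so the more prevalent exposure has the lower lift, which is the pattern described for exposures A and B.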

CONCLUSIONS

Lift, relative risk, and odds ratio are positively correlated and share the same null value. However, lift depends on the exposure prevalence, and thus is not straightforward to interpret or to use to compare association strength. Tools are provided to obtain the relative risk and odds ratio from lift.
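A conversion of the kind mentioned in the conclusions could look like the sketch below. It is not the authors' tool: the function names are hypothetical, and the formulas are obtained here by algebra from the standard definitions lift = P(O|E)/P(O) and RR = P(O|E)/P(O|not E), assuming the exposure prevalence P(E) (and, for the odds ratio, the outcome prevalence P(O)) is known.

```python
def _check_probability(p, name):
    # Prevalences must lie strictly between 0 and 1 for the ratios to exist.
    if not 0.0 < p < 1.0:
        raise ValueError(f"{name} must be strictly between 0 and 1")


def lift_from_rr(rr, p_e):
    """lift = RR / (1 + P(E) * (RR - 1))."""
    _check_probability(p_e, "p_e")
    return rr / (1.0 + p_e * (rr - 1.0))


def rr_from_lift(lift, p_e):
    """Invert the relation above: RR = lift * (1 - P(E)) / (1 - lift * P(E))."""
    _check_probability(p_e, "p_e")
    denom = 1.0 - lift * p_e
    if denom <= 0.0:
        # lift >= 1/P(E) would mean no outcomes among the unexposed.
        raise ValueError("lift must be below 1 / p_e for a finite relative risk")
    return lift * (1.0 - p_e) / denom


def or_from_lift(lift, p_e, p_o):
    """Odds ratio from lift, exposure prevalence, and outcome prevalence."""
    _check_probability(p_e, "p_e")
    _check_probability(p_o, "p_o")
    p1 = lift * p_o                                   # P(O | E)
    p0 = p_o * (1.0 - lift * p_e) / (1.0 - p_e)       # P(O | not E)
    if not (0.0 < p1 < 1.0 and 0.0 < p0 < 1.0):
        raise ValueError("inputs imply a conditional risk outside (0, 1)")
    return (p1 / (1.0 - p1)) / (p0 / (1.0 - p0))
```

As a usage example, two exposures with the same relative risk of 3 but prevalences 0.5 and 0.1 give lifts of about 1.5 and 2.5 through `lift_from_rr`, and `rr_from_lift` recovers a relative risk of about 3 from either pair.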

Figures

Fig. 1: https://cdn.ncbi.nlm.nih.gov/pmc/blobs/0f61/6580490/01ac2ab84267/12911_2019_838_Fig1_HTML.jpg
Fig. 2: https://cdn.ncbi.nlm.nih.gov/pmc/blobs/0f61/6580490/3e584a9aaf57/12911_2019_838_Fig2_HTML.jpg
Fig. 3: https://cdn.ncbi.nlm.nih.gov/pmc/blobs/0f61/6580490/65a5f848e618/12911_2019_838_Fig3_HTML.jpg
