CauRuler: Causal irredundant association rule miner for complex patient trajectory modelling.

Author information

eXiT Research Group, Universitat de Girona (UdG), EPS - Edifici P-IV, Carrer Universitat de Girona, 6, Girona, 17003, Catalunya, Spain; Assistance strategy management. Hospital Germans Trias i Pujol, (ICS), Carretera de Canyet, Badalona, 08916, Catalunya, Spain; Research Group on Innovation, Health Economics and Digital Transformation, Institut Germans Trias i Pujol (IGTP), Cami de les Escoles, Badalona, 08916, Catalunya, Spain.

Assistance strategy management. Hospital Germans Trias i Pujol, (ICS), Carretera de Canyet, Badalona, 08916, Catalunya, Spain; Research Group on Innovation, Health Economics and Digital Transformation, Institut Germans Trias i Pujol (IGTP), Cami de les Escoles, Badalona, 08916, Catalunya, Spain.

Publication information

Comput Biol Med. 2023 Mar;155:106636. doi: 10.1016/j.compbiomed.2023.106636. Epub 2023 Feb 9.

Abstract

BACKGROUND AND OBJECTIVES

Discovering causal associations between variables is one of the main goals of clinical trials, with the ultimate aim of identifying the causes of specific health conditions. Prior knowledge of causal paths could help prevent patients from developing the resulting conditions. In recent years, thanks to the enormous amount of health data stored with the support of digital tools, attempts have been made to employ machine learning to infer causality. Those methodologies have deficiencies both in controlling confounders when analysing causality and in providing causal rules general enough to be useful in healthcare practice. In contrast, this work presents and evaluates CauRuler, a new approach that derives causality from association rules. The proposed approach uses a pruning strategy to reduce the association rule set without compromising the causality learning capability of the algorithm. This makes the algorithm suitable for exploiting large health databases with thousands of patients and medical instances. CauRuler can control a larger number of confounders than other proposals, bringing robustness to causal analysis and avoiding the identification of spurious associations. Additionally, the method generalizes causality using anti-monotone properties to obtain complex and general causal paths. The method can identify correct causal associations in complex medical databases with retrospective data.

METHOD

CauRuler extends association rule mining with an irredundancy property so that the set of rules learnt is reduced in size and generalized. General association rules, composed of fewer items, make it possible to control more confounding variables and so to verify, with more statistical evidence from the available data, whether they represent causal paths in patient disease trajectories.
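The idea of pruning redundant rules in favour of more general ones can be sketched as follows. This is only an illustration under stated assumptions, not CauRuler itself: the abstract does not give the exact irredundancy criterion, so the sketch assumes one common definition (a rule is redundant if a rule with a strictly smaller antecedent reaches at least the same confidence). The patient data, the thresholds, and the function names `support`, `confidence`, `mine_rules` and `prune_redundant` are all hypothetical.

```python
from itertools import combinations

# Toy patient "transactions": each set holds the codes recorded for one patient.
# All codes and data are hypothetical, for illustration only.
patients = [
    {"obesity", "hypertension", "diabetes"},
    {"obesity", "diabetes"},
    {"obesity", "hypertension", "diabetes"},
    {"hypertension"},
    {"obesity", "smoker", "diabetes"},
    {"smoker"},
]

def support(itemset, transactions):
    """Fraction of transactions that contain every item in itemset."""
    itemset = frozenset(itemset)
    return sum(itemset <= t for t in transactions) / len(transactions)

def confidence(antecedent, consequent, transactions):
    """Empirical estimate of P(consequent | antecedent)."""
    s_a = support(antecedent, transactions)
    if s_a == 0:
        return 0.0
    return support(set(antecedent) | {consequent}, transactions) / s_a

def mine_rules(transactions, target, min_support=0.3, min_conf=0.8, max_len=2):
    """Enumerate rules antecedent -> target that pass both thresholds (assumed values)."""
    items = sorted(set().union(*transactions) - {target})
    rules = {}
    for k in range(1, max_len + 1):
        for antecedent in combinations(items, k):
            a = frozenset(antecedent)
            if support(a | {target}, transactions) >= min_support:
                c = confidence(a, target, transactions)
                if c >= min_conf:
                    rules[a] = c
    return rules

def prune_redundant(rules):
    """Assumed criterion: drop a rule when a strictly more general rule
    (proper subset antecedent) has confidence at least as high."""
    return {
        a: c for a, c in rules.items()
        if not any(b < a and rules[b] >= c for b in rules)
    }

rules = mine_rules(patients, target="diabetes")
irredundant = prune_redundant(rules)
for antecedent, conf in irredundant.items():
    print(sorted(antecedent), "-> diabetes", f"confidence={conf:.2f}")
```

On this toy data, the rule {hypertension, obesity} -> diabetes is discarded because the shorter rule {obesity} -> diabetes already has the same confidence. A shorter antecedent leaves more of the remaining variables free to be treated as potential confounders, which is the motivation the paragraph above describes.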

RESULTS

CauRuler has been tested on a complex real medical database (3.5 million visits to primary care services between 2019 and 2020, controlling over 15,000 different variables, including diagnoses, demographics and other clinical patient data). The pruning strategy reduces the rule set from 7,732 to 2,240 rules, of which 46 were found to have causal relationships in the patient trajectories; these were generalized to 14 rules that tested as true causal relationships in the confounding analysis. These rules have been validated by clinicians with the support of a graphical map. The obtained causal paths control an average of 906 confounder variables, yielding robust results.
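To put the reported rule counts in proportion, the short calculation below derives the relative size of each step. The four counts come from the text above; the variable names and the choice of ratios are only illustrative.

```python
# Rule counts reported in the abstract.
mined, pruned, causal_candidates, validated = 7732, 2240, 46, 14

pruning_reduction = 1 - pruned / mined                 # share of rules removed by pruning
candidate_share = causal_candidates / pruned           # share of pruned rules found causal
generalization_ratio = validated / causal_candidates   # generalized rules per causal rule

print(f"Removed by pruning: {pruning_reduction:.1%}")              # 71.0%
print(f"Pruned rules found causal: {candidate_share:.1%}")         # 2.1%
print(f"Generalized vs. causal rules: {generalization_ratio:.1%}") # 30.4%
```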

CONCLUSIONS

Causal relationships enable predicting causal paths between health conditions according to patient trajectories. Knowing these causal paths is crucial for understanding and preventing the appearance or worsening of diseases in patients. CauRuler, with highly demanding thresholds, has proven its efficiency and effectiveness in identifying previously known causal associations between diagnoses on which there is consensus in the medical community. Relaxing these thresholds should help identify interesting general causal paths.
