Using the Diagnostic Odds Ratio to Select Patterns to Build an Interpretable Pattern-Based Classifier in a Clinical Domain: Multivariate Sequential Pattern Mining Study.

Authors

Casanova Isidoro J, Campos Manuel, Juarez Jose M, Gomariz Antonio, Lorente-Ros Marta, Lorente Jose A

Affiliations

AIKE Research Team (INTICO), Computer Science Faculty, University of Murcia, Murcia, Spain.

Murcian Bio-Health Institute (IMIB-Arrixaca), Murcia, Spain.

Publication information

JMIR Med Inform. 2022 Aug 10;10(8):e32319. doi: 10.2196/32319.

DOI: 10.2196/32319
PMID: 35947437
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC9403826/

Abstract

BACKGROUND

It is important to exploit all available data on patients in settings such as intensive care burn units (ICBUs), where several variables are recorded over time. It is possible to take advantage of the multivariate patterns that model the evolution of patients to predict their survival. However, pattern discovery algorithms generate a large number of patterns, of which only some are relevant for classification.

OBJECTIVE

We propose to use the diagnostic odds ratio (DOR) to select multivariate sequential patterns used in the classification in a clinical domain, rather than employing frequency properties.
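
For reference, the DOR has a standard definition in terms of a 2x2 contingency table. The block below restates that definition; reading a pattern as a "test" (TP = positive-class patients whose sequence contains the pattern, FP = negative-class patients who contain it, FN and TN the corresponding patients who do not contain it) is an assumption made here for illustration, not a detail given in the abstract.

```latex
\mathrm{DOR}
  = \frac{TP / FN}{FP / TN}
  = \frac{TP \cdot TN}{FP \cdot FN}
  = \frac{\text{sensitivity} / (1 - \text{sensitivity})}
         {(1 - \text{specificity}) / \text{specificity}}
```

A DOR of 1 means the pattern does not discriminate between the classes, and larger values indicate stronger discrimination.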

METHODS

We used data from the ICBU at the University Hospital of Getafe, where 6 temporal variables were recorded daily over 5 days for 465 patients. To model the evolution of these clinical variables, we mined multivariate sequential patterns after applying 2 different discretization methods to the continuous attributes. We compared 4 ways of employing the DOR for pattern selection: (1) using it as a threshold to select patterns with a minimum DOR; (2) selecting patterns whose differential DOR with regard to their extensions is higher than a threshold; (3) selecting patterns whose DOR CIs do not overlap; and (4) combining the threshold and nonoverlapping CIs to select the most discriminative patterns. As a baseline, we compared our proposals with Jumping Emerging Patterns, one of the most frequently used pattern selection techniques based on frequency properties.
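
As a rough illustration of options (1), (3), and (4), the sketch below computes a DOR with a log-scale Wald confidence interval and applies a threshold and a CI-overlap check. It is a minimal sketch under stated assumptions, not the authors' implementation: the function names, the 0.5 continuity correction for empty cells, the 95% level (z = 1.96), and the choice to compare a candidate pattern's CI against that of one reference pattern are all hypothetical, since the abstract does not specify these details.

```python
import math

def dor_with_ci(tp, fn, fp, tn, z=1.96):
    """Return (DOR, lower, upper) using a Wald interval on the log scale.

    Assumption: if any cell is zero, 0.5 is added to every cell
    (a common continuity correction; not stated in the abstract).
    """
    if 0 in (tp, fn, fp, tn):
        tp, fn, fp, tn = (x + 0.5 for x in (tp, fn, fp, tn))
    dor = (tp * tn) / (fp * fn)
    se = math.sqrt(1 / tp + 1 / fn + 1 / fp + 1 / tn)
    lower = math.exp(math.log(dor) - z * se)
    upper = math.exp(math.log(dor) + z * se)
    return dor, lower, upper

def cis_overlap(ci_a, ci_b):
    """True if two (lower, upper) intervals share at least one point."""
    return ci_a[0] <= ci_b[1] and ci_b[0] <= ci_a[1]

def select_combined(candidate, reference, min_dor):
    """Hypothetical version of option (4): keep the candidate only if its
    DOR reaches min_dor AND its CI does not overlap the reference's CI.

    candidate / reference are (tp, fn, fp, tn) count tuples.
    """
    dor_c, lo_c, hi_c = dor_with_ci(*candidate)
    _, lo_r, hi_r = dor_with_ci(*reference)
    return dor_c >= min_dor and not cis_overlap((lo_c, hi_c), (lo_r, hi_r))

# Illustrative counts only (not taken from the study).
candidate = (40, 10, 20, 30)
reference = (20, 30, 25, 25)
print(dor_with_ci(*candidate))
print(select_combined(candidate, reference, min_dor=2.0))
```

Working on the log scale keeps the interval strictly positive and matches the usual standard-error formula for an odds ratio; option (2), the differential DOR with regard to a pattern's extensions, is left out because the abstract does not define how the difference is computed.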

RESULTS

We compared the number and length of the patterns eventually selected, classification performance, and pattern and model interpretability. We show that discretization has a great impact on the accuracy of the classification model, but that a trade-off must be found between classification accuracy and the physicians' capacity to interpret the patterns obtained. We also found that the experiments combining threshold and nonoverlapping CIs (Option 4) yielded the fewest and shortest patterns, implying an acceptable loss of accuracy in exchange for easier clinician interpretation. The best classification model according to this trade-off is a JRIP classifier with only 5 patterns (20 items), built using unsupervised correlation preserving discretization and differential DOR in a beam search for the best pattern. It achieves a specificity of 56.32% and an area under the receiver operating characteristic curve of 0.767.

CONCLUSIONS

A method for classifying patients' survival can benefit from sequential patterns, as these patterns capture knowledge about the temporal evolution of the variables in the ICBU setting. We have shown that the DOR can be used in several ways and that it is a suitable measure for selecting discriminative, interpretable, high-quality patterns.
