GASP: Graph-based Approximate Sequential Pattern Mining for Electronic Health Records.

Authors

Dong Wenqin, Lee Eric W, Hertzberg Vicki Stover, Simpson Roy L, Ho Joyce C

Affiliations

Carnegie Mellon University.

Emory University.

Publication information

Adv Databases Inf Syst. 2021 Aug;1450:50-60. doi: 10.1007/978-3-030-85082-1_5. Epub 2021 Jul 17.

Abstract

Sequential pattern mining can be used to extract meaningful sequences from electronic health records. However, conventional sequential pattern mining algorithms that discover all frequent sequential patterns can incur a high computational cost and be susceptible to noise in the observations. Approximate sequential pattern mining techniques have been introduced to address these shortcomings, yet existing approximate methods fail to reflect the true frequent sequential patterns or only target single-item event sequences. Multi-item event sequences are prominent in healthcare, as a patient can have multiple interventions in a single visit. To alleviate these issues, we propose GASP, a graph-based approximate sequential pattern mining method that discovers frequent patterns for multi-item event sequences. Our approach compresses the sequential information into a concise graph structure, which has computational benefits. The empirical results on two healthcare datasets suggest that GASP outperforms existing approximate models by improving recoverability and extracts better predictive patterns.
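To make the notion of a frequent pattern over multi-item event sequences concrete, here is a minimal sketch of the standard exact support computation. It is not GASP and does not reproduce the paper's graph structure. The names `contains` and `support`, the type aliases, and the toy patient data are all assumptions introduced for illustration.

```python
from typing import FrozenSet, List, Sequence

# Assumed representation (not from the paper):
Event = FrozenSet[str]            # one visit: codes recorded together
EventSequence = Sequence[Event]   # one patient: visits in time order


def contains(sequence: EventSequence, pattern: EventSequence) -> bool:
    """Return True if `pattern` is a subsequence of `sequence`.

    Each pattern itemset must be a subset of a distinct event, in order.
    Greedy earliest matching is sufficient for this containment test.
    """
    position = 0
    for itemset in pattern:
        # Skip events that do not cover the current itemset.
        while position < len(sequence) and not itemset <= sequence[position]:
            position += 1
        if position == len(sequence):
            return False
        position += 1  # the next itemset must match a strictly later event
    return True


def support(database: List[EventSequence], pattern: EventSequence) -> float:
    """Fraction of sequences in `database` that contain `pattern`."""
    if not database:
        return 0.0
    return sum(contains(seq, pattern) for seq in database) / len(database)


# Toy data invented for illustration only (not from the paper's datasets).
patients = [
    [frozenset({"A", "B"}), frozenset({"C"}), frozenset({"D"})],
    [frozenset({"A"}), frozenset({"B", "C"}), frozenset({"D"})],
    [frozenset({"A", "B"}), frozenset({"D"})],
]
pattern = [frozenset({"A", "B"}), frozenset({"D"})]

if __name__ == "__main__":
    print(support(patients, pattern))
```

In this toy database the second patient never has codes A and B in the same visit, so only two of the three sequences contain the pattern. A pattern counts as frequent when its support meets a chosen threshold. Exact miners enumerate every such pattern, which is the cost that approximate methods aim to avoid.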

Similar articles

1
GASP: Graph-based Approximate Sequential Pattern Mining for Electronic Health Records.
Adv Databases Inf Syst. 2021 Aug;1450:50-60. doi: 10.1007/978-3-030-85082-1_5. Epub 2021 Jul 17.
2
FuzzyGap: Sequential Pattern Mining for Predicting Chronic Heart Failure in Clinical Pathways.
AMIA Jt Summits Transl Sci Proc. 2019 May 6;2019:222-231. eCollection 2019.
3
Mining actionable combined high utility incremental and associated sequential patterns.
PLoS One. 2023 Mar 29;18(3):e0283365. doi: 10.1371/journal.pone.0283365. eCollection 2023.
4
Event prediction from news text using subgraph embedding and graph sequence mining.
World Wide Web. 2022;25(6):2403-2428. doi: 10.1007/s11280-021-01002-1. Epub 2022 Feb 28.
5
Graph-based biomedical text summarization: An itemset mining and sentence clustering approach.
J Biomed Inform. 2018 Aug;84:42-58. doi: 10.1016/j.jbi.2018.06.005. Epub 2018 Jun 15.
6
Application of gap-constraints given sequential frequent pattern mining for protein function prediction.
Osong Public Health Res Perspect. 2015 Apr;6(2):112-20. doi: 10.1016/j.phrp.2015.01.006. Epub 2015 Feb 24.
7
Mining significant high utility gene regulation sequential patterns.
BMC Syst Biol. 2017 Dec 14;11(Suppl 6):109. doi: 10.1186/s12918-017-0475-4.
8
MACFP: Maximal Approximate Consecutive Frequent Pattern Mining under Edit Distance.
Proc SIAM Int Conf Data Min. 2016 May;2016:558-566. doi: 10.1137/1.9781611974348.63.
9
NetNMSP: Nonoverlapping maximal sequential pattern mining.
Appl Intell (Dordr). 2022;52(9):9861-9884. doi: 10.1007/s10489-021-02912-3. Epub 2022 Jan 10.
10
Top-k Self-Adaptive Contrast Sequential Pattern Mining.
IEEE Trans Cybern. 2022 Nov;52(11):11819-11833. doi: 10.1109/TCYB.2021.3082114. Epub 2022 Oct 17.

Cited by

1
Sequential data mining of infection patterns as predictors for onset of type 1 diabetes in genetically at-risk individuals.
J Biomed Inform. 2023 Jun;142:104385. doi: 10.1016/j.jbi.2023.104385. Epub 2023 May 9.
