Temporal condition pattern mining in large, sparse electronic health record data: A case study in characterizing pediatric asthma.

Author affiliations

Department of Information Science, College of Computing & Informatics, Drexel University, Philadelphia, Pennsylvania, USA.

Department of Biomedical and Health Informatics, Children's Hospital of Philadelphia, Philadelphia, Pennsylvania, USA.

Publication information

J Am Med Inform Assoc. 2020 Apr 1;27(4):558-566. doi: 10.1093/jamia/ocaa005.

DOI:10.1093/jamia/ocaa005

PMID:32049282

Original article: https://pmc.ncbi.nlm.nih.gov/articles/PMC7075539/

Abstract

OBJECTIVE

This study introduces a temporal condition pattern mining methodology to address the sparse nature of coded condition concept utilization in electronic health record data. As a validation study, we applied this method to reveal condition patterns surrounding an initial diagnosis of pediatric asthma.

MATERIALS AND METHODS

The SPADE (Sequential PAttern Discovery using Equivalence classes) algorithm was used to identify common temporal condition patterns surrounding the initial diagnosis of pediatric asthma in a study population of 71 824 patients from the Children's Hospital of Philadelphia. SPADE was applied to a dataset with diagnoses coded using International Classification of Diseases (ICD) concepts and separately to a dataset with the ICD codes mapped to their corresponding expanded diagnostic clusters (EDCs). Common temporal condition patterns surrounding the initial diagnosis of pediatric asthma ascertained by SPADE from both the ICD and EDC datasets were compared.
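The effect of grouping sparse codes on pattern support can be illustrated with a small sketch. This is not SPADE itself (which uses vertical id-lists and equivalence classes) and not the authors' pipeline; it is a brute-force support count over toy data. The codes `A1`, `A2`, `A3`, the group label `GROUP_A`, the `ASTHMA` marker, and the 0.5 support threshold are all hypothetical placeholders, not real ICD or EDC identifiers.

```python
from typing import Dict, List, Sequence, Set

# Each patient is a time-ordered list of visits; each visit is a set of codes.
Patient = List[Set[str]]


def contains(patient: Patient, pattern: Sequence[str]) -> bool:
    """True if the pattern items occur in order, each in a strictly later visit."""
    pos = 0
    for visit in patient:
        # Greedy earliest match; at most one pattern item is consumed per visit.
        if pos < len(pattern) and pattern[pos] in visit:
            pos += 1
    return pos == len(pattern)


def support(patients: List[Patient], pattern: Sequence[str]) -> float:
    """Fraction of patients whose visit sequence contains the pattern."""
    return sum(contains(p, pattern) for p in patients) / len(patients)


def map_codes(patients: List[Patient], mapping: Dict[str, str]) -> List[Patient]:
    """Replace each fine-grained code with its group; unmapped codes are kept."""
    return [[{mapping.get(c, c) for c in visit} for visit in p] for p in patients]


# Hypothetical toy data: three closely related fine-grained codes, each seen once.
patients: List[Patient] = [
    [{"A1"}, {"ASTHMA"}],
    [{"A2"}, {"ASTHMA"}],
    [{"A3"}, {"ASTHMA"}],
    [{"ASTHMA"}],
]
# Hypothetical fine-to-coarse mapping (stands in for an ICD-to-EDC lookup).
to_group = {"A1": "GROUP_A", "A2": "GROUP_A", "A3": "GROUP_A"}
grouped = map_codes(patients, to_group)

MIN_SUPPORT = 0.5  # assumed threshold, for illustration only
fine = support(patients, ("A1", "ASTHMA"))
coarse = support(grouped, ("GROUP_A", "ASTHMA"))
print(fine, fine >= MIN_SUPPORT)      # 0.25 False
print(coarse, coarse >= MIN_SUPPORT)  # 0.75 True
```

In this toy setting, no single fine-grained code precedes the asthma marker often enough to pass the threshold, while the grouped concept does, which is the kind of sparsity effect the two-dataset comparison is designed to expose.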

RESULTS

SPADE identified 36 unique diagnoses in the mapped EDC dataset, whereas only 19 were recognized in the ICD dataset. Temporal trends in condition diagnoses ascertained from the EDC data were not discoverable in the ICD dataset.

DISCUSSION

Mining frequent temporal condition patterns from large electronic health record datasets may reveal previously unknown associations between diagnoses that could inform future research into causation or other relationships. Mapping sparsely coded medical concepts into homogenous groups was essential to discovering potentially useful information from our dataset.

CONCLUSIONS

We expect that the presented methodology is applicable to the study of diagnostic trajectories for other clinical conditions and can be extended to study temporal patterns of other coded medical concepts such as medications and procedures.
