Temporal condition pattern mining in large, sparse electronic health record data: A case study in characterizing pediatric asthma.

Affiliations

Department of Information Science, College of Computing & Informatics, Drexel University, Philadelphia, Pennsylvania, USA.

Department of Biomedical and Health Informatics, Children's Hospital of Philadelphia, Philadelphia, Pennsylvania, USA.

Publication information

J Am Med Inform Assoc. 2020 Apr 1;27(4):558-566. doi: 10.1093/jamia/ocaa005.

Abstract

OBJECTIVE

This study introduces a temporal condition pattern mining methodology to address the sparse nature of coded condition concept utilization in electronic health record data. As a validation study, we applied this method to reveal condition patterns surrounding an initial diagnosis of pediatric asthma.

MATERIALS AND METHODS

The SPADE (Sequential PAttern Discovery using Equivalence classes) algorithm was used to identify common temporal condition patterns surrounding the initial diagnosis of pediatric asthma in a study population of 71 824 patients from the Children's Hospital of Philadelphia. SPADE was applied to a dataset with diagnoses coded using International Classification of Diseases (ICD) concepts and separately to a dataset with the ICD codes mapped to their corresponding expanded diagnostic clusters (EDCs). Common temporal condition patterns surrounding the initial diagnosis of pediatric asthma ascertained by SPADE from both the ICD and EDC datasets were compared.
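The effect of grouping codes before mining can be illustrated with a small sketch. This is not SPADE and not the authors' pipeline: it is a simplified counter for ordered pairs of diagnoses (A recorded before B), and the patient histories, the `CODE_TO_GROUP` mapping, the group names, and the function `frequent_ordered_pairs` are all hypothetical stand-ins chosen for illustration. The mapping only plays the role of an ICD-to-EDC lookup; it does not reproduce real EDC assignments.

```python
from collections import defaultdict

# Toy, made-up histories: patient -> list of (visit_index, diagnosis code).
# Code strings are illustrative only and are not taken from the study data.
PATIENTS = {
    "p1": [(1, "J20.9"), (2, "J45.909")],
    "p2": [(1, "J21.0"), (2, "J45.20")],
    "p3": [(1, "J20.5"), (2, "J45.30")],
    "p4": [(1, "L20.9"), (2, "J45.909")],
}

# Hypothetical code-to-group lookup standing in for an ICD-to-EDC mapping.
CODE_TO_GROUP = {
    "J20.9": "acute_lower_resp_infection",
    "J21.0": "acute_lower_resp_infection",
    "J20.5": "acute_lower_resp_infection",
    "J45.909": "asthma",
    "J45.20": "asthma",
    "J45.30": "asthma",
    "L20.9": "dermatitis",
}


def frequent_ordered_pairs(patients, min_support, mapping=None):
    """Return {(a, b): support} for pairs where a is recorded before b.

    Support is the fraction of patients whose history contains the pair.
    If `mapping` is given, each code is replaced by its group first.
    """
    counts = defaultdict(int)
    for events in patients.values():
        if mapping is not None:
            events = [(t, mapping.get(code, code)) for t, code in events]
        # Collect each ordered pair at most once per patient.
        pairs = {
            (a, b)
            for t1, a in events
            for t2, b in events
            if t1 < t2 and a != b
        }
        for pair in pairs:
            counts[pair] += 1
    n = len(patients)
    return {pair: c / n for pair, c in counts.items() if c / n >= min_support}


raw = frequent_ordered_pairs(PATIENTS, min_support=0.5)
grouped = frequent_ordered_pairs(PATIENTS, min_support=0.5, mapping=CODE_TO_GROUP)
print("raw codes:", raw)
print("grouped:  ", grouped)
```

In this toy data every raw code pair occurs in only one of four patients, so nothing reaches the 50% support threshold. After grouping, three of the four histories share the same ordered pair, which is the kind of pattern that sparse coding can hide.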

RESULTS

SPADE identified 36 unique diagnoses in the mapped EDC dataset, whereas only 19 were recognized in the ICD dataset. Temporal trends in condition diagnoses ascertained from the EDC data were not discoverable in the ICD dataset.

DISCUSSION

Mining frequent temporal condition patterns from large electronic health record datasets may reveal previously unknown associations between diagnoses that could inform future research into causation or other relationships. Mapping sparsely coded medical concepts into homogeneous groups was essential to discovering potentially useful information from our dataset.

CONCLUSIONS

We expect that the presented methodology is applicable to the study of diagnostic trajectories for other clinical conditions and can be extended to study temporal patterns of other coded medical concepts such as medications and procedures.
