Temporal condition pattern mining in large, sparse electronic health record data: A case study in characterizing pediatric asthma.

Author affiliations

Department of Information Science, College of Computing & Informatics, Drexel University, Philadelphia, Pennsylvania, USA.

Department of Biomedical and Health Informatics, Children's Hospital of Philadelphia, Philadelphia, Pennsylvania, USA.

Publication information

J Am Med Inform Assoc. 2020 Apr 1;27(4):558-566. doi: 10.1093/jamia/ocaa005.

Abstract

OBJECTIVE

This study introduces a temporal condition pattern mining methodology to address the sparse nature of coded condition concept utilization in electronic health record data. As a validation study, we applied this method to reveal condition patterns surrounding an initial diagnosis of pediatric asthma.

MATERIALS AND METHODS

The SPADE (Sequential PAttern Discovery using Equivalence classes) algorithm was used to identify common temporal condition patterns surrounding the initial diagnosis of pediatric asthma in a study population of 71 824 patients from the Children's Hospital of Philadelphia. SPADE was applied to a dataset with diagnoses coded using International Classification of Diseases (ICD) concepts and separately to a dataset with the ICD codes mapped to their corresponding expanded diagnostic clusters (EDCs). Common temporal condition patterns surrounding the initial diagnosis of pediatric asthma ascertained by SPADE from both the ICD and EDC datasets were compared.
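To illustrate the kind of computation SPADE performs, the following is a minimal sketch, not the study's implementation. It assumes a toy event table and covers only the first step of the algorithm: building vertical id-lists (item to patient to visit times) and joining them temporally to count the support of two-step sequences. The data, the condition labels, and the function names (`build_id_lists`, `temporal_join`, `frequent_2_sequences`) are all hypothetical. Full SPADE also handles longer sequences, itemsets within a visit, and equivalence-class decomposition, none of which is shown here.

```python
from collections import defaultdict
from itertools import product

# Hypothetical toy data: (patient_id, visit_index, condition_label).
events = [
    (1, 0, "cough"), (1, 1, "asthma"),
    (2, 0, "cough"), (2, 2, "asthma"),
    (3, 0, "eczema"), (3, 1, "cough"), (3, 3, "asthma"),
    (4, 0, "asthma"), (4, 1, "cough"),
]


def build_id_lists(events):
    """Vertical format: item -> {patient_id: sorted list of visit indices}."""
    id_lists = defaultdict(lambda: defaultdict(list))
    for pid, t, item in events:
        id_lists[item][pid].append(t)
    for per_patient in id_lists.values():
        for times in per_patient.values():
            times.sort()
    return id_lists


def temporal_join(a, b):
    """Id-list of the sequence a -> b.

    Keeps patients in whom some occurrence of b is strictly later
    than the earliest occurrence of a.
    """
    joined = {}
    for pid, a_times in a.items():
        b_times = b.get(pid)
        if not b_times:
            continue
        later = [t for t in b_times if t > a_times[0]]
        if later:
            joined[pid] = later
    return joined


def frequent_2_sequences(events, min_support):
    """Return {(a, b): support} for sequences a -> b seen in >= min_support patients."""
    id_lists = build_id_lists(events)
    # Support of a single item is the number of distinct patients with it.
    frequent_items = [i for i, l in id_lists.items() if len(l) >= min_support]
    result = {}
    for a, b in product(frequent_items, repeat=2):
        if a == b:
            continue  # simplification: repeated-item sequences are skipped
        support = len(temporal_join(id_lists[a], id_lists[b]))
        if support >= min_support:
            result[(a, b)] = support
    return result


patterns = frequent_2_sequences(events, min_support=2)
```

In this toy table, "eczema" appears in only one patient and is pruned before any join is attempted, which is the same pruning that makes rarely used codes invisible to the miner.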

RESULTS

SPADE identified 36 unique diagnoses in the mapped EDC dataset, whereas only 19 were recognized in the ICD dataset. Temporal trends in condition diagnoses ascertained from the EDC data were not discoverable in the ICD dataset.

DISCUSSION

Mining frequent temporal condition patterns from large electronic health record datasets may reveal previously unknown associations between diagnoses that could inform future research into causation or other relationships. Mapping sparsely coded medical concepts into homogeneous groups was essential to discovering potentially useful information from our dataset.
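The effect of grouping on sparsity can be shown with a small worked example. This is a sketch under stated assumptions: the records are invented, and `CODE_TO_GROUP` is a hypothetical mapping with made-up group labels, not the actual ICD-to-EDC mapping used in the study. It counts patient-level support for each concept before and after mapping and applies the same minimum-support threshold to both.

```python
from collections import defaultdict

# Hypothetical mapping from fine-grained codes to a coarser group label.
# This is NOT the actual EDC mapping; the group names are invented.
CODE_TO_GROUP = {
    "J00": "URI_GROUP",
    "J02.9": "URI_GROUP",
    "J06.9": "URI_GROUP",
    "L20.9": "DERM_GROUP",
}

# Hypothetical toy records: (patient_id, diagnosis_code).
records = [
    (1, "J00"), (2, "J02.9"), (3, "J06.9"),
    (4, "L20.9"), (5, "J00"), (5, "J06.9"),
]


def patient_support(records, key=lambda code: code):
    """Count distinct patients per concept, after applying `key` to each code."""
    patients = defaultdict(set)
    for pid, code in records:
        patients[key(code)].add(pid)
    return {concept: len(pids) for concept, pids in patients.items()}


def frequent(support, min_support):
    """Keep only concepts that reach the minimum patient count."""
    return {c: n for c, n in support.items() if n >= min_support}


raw = patient_support(records)
grouped = patient_support(records, key=lambda c: CODE_TO_GROUP.get(c, c))

raw_frequent = frequent(raw, min_support=3)
grouped_frequent = frequent(grouped, min_support=3)
```

With a threshold of three patients, no individual code qualifies, while the grouped concept is supported by four patients. A pattern miner given the raw codes would therefore have nothing to extend, whereas the grouped data leaves a candidate to build sequences from.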

CONCLUSIONS

We expect that the presented methodology is applicable to the study of diagnostic trajectories for other clinical conditions and can be extended to study temporal patterns of other coded medical concepts such as medications and procedures.
