Health Management Center, The First Affiliated Hospital of Zhengzhou University, Zhengzhou, China.
School of Economics and Management, Institute of Systems Engineering, Dalian University of Technology, Dalian, China.
Front Public Health. 2022 Jan 20;9:793801. doi: 10.3389/fpubh.2021.793801. eCollection 2021.
The reasonable classification of a large number of distinct diagnosis codes can clarify patient diagnostic information and help clinicians to improve their ability to assign and target treatment for primary diseases. Our objective is to identify and predict a unifying diagnosis (UD) from electronic medical records (EMRs).
We screened 4,418 sepsis patients from the public MIMIC-III database and extracted their diagnostic information for UD identification, and their demographic, laboratory examination, chief complaint, and history of present illness information for UD prediction. We proposed a data-driven UD identification and prediction method (UDIPM) that embeds the disease ontology structure. First, we designed a set similarity measure that embeds the disease ontology structure to generate a patient similarity matrix. Second, we applied affinity propagation clustering to divide patients into clusters, and extracted a typical diagnosis code co-occurrence pattern from each cluster. Third, we identified a UD by combining visual analysis with a conditional co-occurrence matrix. Finally, we trained five classifiers, in combination with feature fusion and feature selection methods, to predict the UD.
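The first and third steps can be illustrated with a minimal sketch. It is not the UDIPM implementation: the abstract does not give the similarity formula, so the code below assumes that a shared leading prefix between two ICD-9 code strings stands in for closeness in the disease ontology, and it assumes an average best-match rule for comparing code sets. All function names and the three toy patients are hypothetical.

```python
from itertools import combinations


def code_similarity(a: str, b: str) -> float:
    """Shared-prefix length divided by the longer code length.

    Assumption: a common prefix approximates a common ancestor in the
    ontology. This is an illustrative stand-in, not the paper's measure.
    """
    if not a or not b:
        return 0.0
    shared = 0
    for x, y in zip(a, b):
        if x != y:
            break
        shared += 1
    return shared / max(len(a), len(b))


def set_similarity(s: set, t: set) -> float:
    """Symmetric average best-match similarity between two code sets."""
    if not s or not t:
        return 0.0
    s_to_t = sum(max(code_similarity(a, b) for b in t) for a in s)
    t_to_s = sum(max(code_similarity(a, b) for a in s) for b in t)
    return (s_to_t + t_to_s) / (len(s) + len(t))


def similarity_matrix(patients: list) -> list:
    """Pairwise patient similarity matrix (symmetric, diagonal set to 1.0)."""
    n = len(patients)
    matrix = [[1.0] * n for _ in range(n)]
    for i, j in combinations(range(n), 2):
        matrix[i][j] = matrix[j][i] = set_similarity(patients[i], patients[j])
    return matrix


def conditional_cooccurrence(patients: list) -> dict:
    """Map (a, b) to the fraction of patients with code a who also have b."""
    code_counts = {}
    pair_counts = {}
    for codes in patients:
        for a in codes:
            code_counts[a] = code_counts.get(a, 0) + 1
            for b in codes:
                if a != b:
                    pair_counts[(a, b)] = pair_counts.get((a, b), 0) + 1
    return {(a, b): c / code_counts[a] for (a, b), c in pair_counts.items()}


# Toy data: each patient is a set of ICD-9 codes written without the dot.
patients = [
    {"0389", "99592", "5849"},
    {"0389", "99592"},
    {"4280", "5849"},
]

sim = similarity_matrix(patients)
cond = conditional_cooccurrence(patients)
```

A matrix such as `sim` could then be handed to an affinity propagation implementation that accepts precomputed similarities (for example, scikit-learn's `AffinityPropagation` with `affinity="precomputed"`), and the `cond` table computed within each resulting cluster is one plausible way to see which code most consistently accompanies the others.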
The experimental results on a public electronic medical record dataset showed that the UDIPM could effectively extract a typical diagnosis code co-occurrence pattern, identify and predict a UD based on patients' diagnostic and admission information, and outperform other fusion methods overall.
Accurate identification and prediction of the UD from a large number of distinct diagnosis codes and multi-source heterogeneous patient admission information in EMRs can provide a data-driven approach to support better integration of diagnosis coding.