Automatic phenotyping of electronical health record: PheVis algorithm.

Author information

Ferté Thomas, Cossin Sébastien, Schaeverbeke Thierry, Barnetche Thomas, Jouhet Vianney, Hejblum Boris P

Affiliations

Bordeaux Hospital University Center, Pôle de santé publique, Service d'information médicale, Unité Informatique et Archivistique Médicales, F-33000 Bordeaux, France; Univ. Bordeaux ISPED, Inserm Bordeaux Population Health Research Center UMR 1219, Inria BSO, team SISTM, F-33000 Bordeaux, France.

Bordeaux Hospital University Center, Pôle de santé publique, Service d'information médicale, Unité Informatique et Archivistique Médicales, F-33000 Bordeaux, France; Univ. Bordeaux, Inserm, Bordeaux Population Health Research Center, team ERIAS, UMR 1219, F-33000 Bordeaux, France.

Publication information

J Biomed Inform. 2021 May;117:103746. doi: 10.1016/j.jbi.2021.103746. Epub 2021 Mar 19.

DOI:10.1016/j.jbi.2021.103746

PMID:33746080

Abstract

Electronic Health Records (EHRs) often lack reliable annotation of patient medical conditions. PheNorm, an automated unsupervised algorithm, was previously developed to identify patient medical conditions from EHR data. PheVis extends PheNorm to the visit level. It combines diagnosis codes with medical concepts extracted from medical notes and incorporates past history in a machine learning approach, providing an interpretable parametric predictor of the probability that a given medical condition occurs at each visit. PheVis is applied to two real-world use cases using the data warehouse of the University Hospital of Bordeaux: i) rheumatoid arthritis, a chronic condition; ii) tuberculosis, an acute condition. Cross-validated AUROCs were 0.943 [0.940; 0.945] and 0.987 [0.983; 0.990], respectively. Cross-validated AUPRCs were 0.754 [0.744; 0.763] and 0.299 [0.198; 0.403], respectively. PheVis performs well for chronic conditions; for acute conditions, its performance in French is limited because natural language processing tools do not exclude mentions of past medical history. It achieves significantly better performance than state-of-the-art unsupervised methods, especially for chronic diseases.
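
The two metrics reported above, AUROC and AUPRC, can both be computed from visit-level labels and predicted probabilities. The following is a minimal sketch, assuming one binary gold-standard label and one predicted probability per visit. The function names and the toy data are hypothetical and are not taken from the PheVis implementation or from the study data.

```python
from typing import Sequence


def auroc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """Rank-based AUROC: the probability that a randomly chosen positive
    visit receives a higher score than a randomly chosen negative visit
    (ties count as one half)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative visit")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def average_precision(labels: Sequence[int], scores: Sequence[float]) -> float:
    """Step-wise estimate of the area under the precision-recall curve.
    Simplification: tied scores are not grouped, so results on tied data
    depend on the input order."""
    n_pos = sum(labels)
    if n_pos == 0:
        raise ValueError("need at least one positive visit")
    # Rank visits from highest to lowest predicted probability.
    ranked = sorted(zip(scores, labels), key=lambda pair: -pair[0])
    hits = 0
    total = 0.0
    for rank, (_, y) in enumerate(ranked, start=1):
        if y == 1:
            hits += 1
            total += hits / rank  # precision at this recall step
    return total / n_pos


# Hypothetical toy data: 8 visits, 3 of which carry the condition.
toy_labels = [1, 0, 1, 0, 0, 0, 1, 0]
toy_scores = [0.9, 0.8, 0.7, 0.4, 0.3, 0.2, 0.6, 0.1]

toy_auroc = auroc(toy_labels, toy_scores)
toy_auprc = average_precision(toy_labels, toy_scores)
print(f"AUROC = {toy_auroc:.3f}, AUPRC = {toy_auprc:.3f}")
```

In practice an established library routine would normally be preferred over hand-written metrics; the sketch only makes the two definitions concrete.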
