Garcelon Nicolas, Neuraz Antoine, Benoit Vincent, Salomon Rémi, Burgun Anita
Institut Imagine, Université Paris Descartes-Sorbonne Paris Cité, Paris, France.
INSERM, Centre de Recherche des Cordeliers, UMR 1138 Equipe 22, Université Paris Descartes, Sorbonne Paris Cité, Paris, France.
J Am Med Inform Assoc. 2017 May 1;24(3):607-613. doi: 10.1093/jamia/ocw144.
The repurposing of electronic health records (EHRs) can improve clinical and genetic research for rare diseases. However, significant information in rare disease EHRs is embedded in the narrative reports, which contain many negated clinical signs and family medical history. This paper presents a method to detect family history and negation in narrative reports and evaluates its impact on selecting populations from a clinical data warehouse (CDW).
We developed a pipeline to process 1.6 million reports from multiple sources. This pipeline is part of the load process of the Necker Hospital CDW.
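The abstract does not describe the detection rules themselves. As a rough illustration of what sentence-level negation and family history detection can look like, here is a minimal trigger-phrase sketch in Python. It is not the authors' pipeline: the trigger lists, the function names `classify_mention` and `_has_trigger`, and the English example sentences are all assumptions made for this example.

```python
import re

# Hypothetical trigger lists, for illustration only (not taken from the paper).
NEGATION_TRIGGERS = ["no", "without", "denies", "absence of"]
FAMILY_TRIGGERS = ["mother", "father", "brother", "sister", "family history"]


def _has_trigger(text, triggers):
    """Return True if any trigger occurs in text as a whole word or phrase."""
    return any(
        re.search(r"\b" + re.escape(t) + r"\b", text, re.IGNORECASE)
        for t in triggers
    )


def classify_mention(sentence, concept):
    """Label one concept mention within one sentence.

    Returns "not_mentioned", "family_history", "negated",
    or "patient_positive".
    """
    idx = sentence.lower().find(concept.lower())
    if idx == -1:
        return "not_mentioned"
    # Assumption: a family trigger anywhere in the sentence means the
    # concept describes a relative rather than the patient.
    if _has_trigger(sentence, FAMILY_TRIGGERS):
        return "family_history"
    # Assumption: negation scope is the text preceding the concept.
    if _has_trigger(sentence[:idx], NEGATION_TRIGGERS):
        return "negated"
    return "patient_positive"


print(classify_mention("The patient has no diarrhea.", "diarrhea"))
print(classify_mention("Her mother has lupus.", "lupus"))
print(classify_mention("Patient presents with lupus and diarrhea.", "lupus"))
```

In a cohort query, only mentions labeled `patient_positive` would count toward selecting a patient, which is how excluding negated and family history mentions can reduce false positives.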
We identified patients with "Lupus and diarrhea," "Crohn's and diabetes," and "NPHP1" from the CDW. The overall precision, recall, specificity, and F-measure were 0.85, 0.98, 0.93, and 0.91, respectively.
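For readers who want to relate these four figures, the sketch below gives the standard definitions in Python. It assumes the reported F-measure is the balanced F1 score (the harmonic mean of precision and recall), which the abstract does not state explicitly; the function names are illustrative.

```python
def precision(tp, fp):
    """Fraction of retrieved patients that are true cases."""
    return tp / (tp + fp)


def recall(tp, fn):
    """Fraction of true cases that were retrieved (sensitivity)."""
    return tp / (tp + fn)


def specificity(tn, fp):
    """Fraction of non-cases that were correctly left out."""
    return tn / (tn + fp)


def f_measure(p, r, beta=1.0):
    """F-beta score; beta=1 gives the harmonic mean of p and r."""
    b2 = beta * beta
    return (1 + b2) * p * r / (b2 * p + r)


# Cross-check using the precision and recall reported in the abstract.
print(round(f_measure(0.85, 0.98), 2))  # 0.91
```

Under that assumption, the reported precision of 0.85 and recall of 0.98 are consistent with the reported F-measure of 0.91.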
The proposed method identifies cases from a CDW of rare disease EHRs with high accuracy.