Pakhomov Serguei S V, Hemingway Harry, Weston Susan A, Jacobsen Steven J, Rodeheffer Richard, Roger Véronique L
Department of Health Sciences Research, Mayo Clinic College of Medicine, Rochester, MN 55905, USA.
Am Heart J. 2007 Apr;153(4):666-73. doi: 10.1016/j.ahj.2006.12.022.
The diagnosis of angina is challenging because it relies on symptom descriptions. Natural language processing (NLP) of the electronic medical record (EMR) can provide access to such information contained in free text that may not be fully captured by conventional diagnostic coding.
We tested the hypothesis that NLP of the EMR improves ascertainment of angina pectoris compared with diagnostic codes.
Billing records of inpatients and outpatients were searched for International Classification of Diseases, Ninth Revision (ICD-9) codes for angina pectoris, chronic ischemic heart disease, and chest pain. EMR clinical reports were searched electronically for 50 specific nonnegated natural language synonyms for these ICD-9 codes. The 2 methods were compared with a standardized assessment of angina by Rose questionnaire at 3 diagnostic levels: unspecified chest pain, exertional chest pain, and Rose angina.
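The search for nonnegated synonyms described above can be illustrated with a minimal sketch. The abstract does not list the 50 terms or say how negation was detected, so the term list, the negation cues, the five-word window, and the function name `nonnegated_mentions` are all assumptions made for illustration, not the study's actual method.

```python
import re

# Hypothetical synonym list; the study used 50 terms that the abstract does not enumerate.
TERMS = ["angina", "chest pain", "chest pressure"]
# Hypothetical negation cues; the study's negation handling is not described in the abstract.
NEGATIONS = ["no", "not", "denies", "denied", "without", "negative for"]

TERM_RE = re.compile(r"\b(" + "|".join(map(re.escape, TERMS)) + r")\b", re.IGNORECASE)
NEG_RE = re.compile(r"\b(" + "|".join(map(re.escape, NEGATIONS)) + r")\b", re.IGNORECASE)


def nonnegated_mentions(text, window=5):
    """Return matched terms that have no negation cue in the `window` words
    preceding them within the same sentence."""
    hits = []
    # Crude sentence split so a negation in one sentence does not affect the next.
    for sentence in re.split(r"[.!?;\n]+", text):
        for match in TERM_RE.finditer(sentence):
            preceding_words = sentence[:match.start()].split()[-window:]
            if not NEG_RE.search(" ".join(preceding_words)):
                hits.append(match.group(1).lower())
    return hits
```

With this sketch, a note such as "Patient denies chest pain. Reports angina on exertion." would yield only the mention of angina, because the chest pain mention follows a negation cue. A fixed word window is a simplification; it misses negations that follow the term or sit farther away.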
Compared with the Rose questionnaire, the true-positive rate of EMR-NLP for unspecified chest pain was 62% (95% CI 55-67) versus 51% (95% CI 44-58) for diagnostic codes (P < .001). For exertional chest pain, the EMR-NLP true-positive rate was 71% (95% CI 61-80) versus 62% (95% CI 52-73) for diagnostic codes (P = .10). Both approaches had 88% (95% CI 65-100) true-positive rate for Rose angina. The EMR-NLP method consistently identified more patients with exertional chest pain over a 28-month follow-up.
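For readers unfamiliar with the metric, the true-positive rate reported above is the share of questionnaire-positive patients that a method also flags. The sketch below computes it together with a Wilson score interval. The abstract does not state which interval method or which counts were used, so the Wilson interval is an assumed choice and the counts in the usage lines are hypothetical, not study data.

```python
import math


def true_positive_rate(true_positives, false_negatives):
    """Fraction of reference-positive cases that the method also identified."""
    return true_positives / (true_positives + false_negatives)


def wilson_interval(successes, n, z=1.96):
    """Wilson score interval for a binomial proportion (z=1.96 gives roughly 95%)."""
    p = successes / n
    denom = 1 + z ** 2 / n
    centre = (p + z ** 2 / (2 * n)) / denom
    half_width = z * math.sqrt(p * (1 - p) / n + z ** 2 / (4 * n ** 2)) / denom
    return centre - half_width, centre + half_width


# Hypothetical counts for illustration only (not taken from the study).
tp, fn = 62, 38
rate = true_positive_rate(tp, fn)
low, high = wilson_interval(tp, tp + fn)
print(f"TPR = {rate:.2f}, 95% CI {low:.2f} to {high:.2f}")
```

Because both methods were evaluated against the same patients, comparing their rates would call for a paired test rather than two independent intervals; the sketch covers only the rate and its interval.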
The EMR-NLP method improves the detection of unspecified and exertional chest pain cases compared with diagnostic codes. These findings have implications for epidemiological and clinical studies of angina pectoris.