Moving Biosurveillance Beyond Coded Data Using AI for Symptom Detection From Physician Notes: Retrospective Cohort Study.

Affiliations

Computational Health Informatics Program, Boston Children's Hospital, Boston, MA, United States.

Department of Pediatrics, Harvard Medical School, Boston, MA, United States.

Publication information

J Med Internet Res. 2024 Apr 4;26:e53367. doi: 10.2196/53367.

DOI:10.2196/53367

PMID:38573752

Original article: https://pmc.ncbi.nlm.nih.gov/articles/PMC11027052/

Abstract

BACKGROUND

Real-time surveillance of emerging infectious diseases necessitates a dynamically evolving, computable case definition, which frequently incorporates symptom-related criteria. For symptom detection, both population health monitoring platforms and research initiatives primarily depend on structured data extracted from electronic health records.

OBJECTIVE

This study sought to validate and test an artificial intelligence (AI)-based natural language processing (NLP) pipeline for detecting COVID-19 symptoms from physician notes in pediatric patients. We specifically study patients presenting to the emergency department (ED) who can be sentinel cases in an outbreak.

METHODS

Subjects in this retrospective cohort study are patients who are 21 years of age and younger, who presented to a pediatric ED at a large academic children's hospital between March 1, 2020, and May 31, 2022. The ED notes for all patients were processed with an NLP pipeline tuned to detect the mention of 11 COVID-19 symptoms based on Centers for Disease Control and Prevention (CDC) criteria. For a gold standard, 3 subject matter experts labeled 226 ED notes and had strong agreement (F-score=0.986; positive predictive value [PPV]=0.972; and sensitivity=1.0). F-score, PPV, and sensitivity were used to compare the performance of both NLP and the International Classification of Diseases, 10th Revision (ICD-10) coding to the gold standard chart review. As a formative use case, variations in symptom patterns were measured across SARS-CoV-2 variant eras.
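
The evaluation metrics named above can be sketched as follows. This is a minimal illustration of the standard definitions, not the study's code: the `Confusion` class, the function names, and the example counts are hypothetical, and the sketch assumes non-zero denominators. The only figures taken from the abstract are the annotator agreement values (PPV=0.972, sensitivity=1.0), which reproduce the reported F-score of 0.986.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Confusion:
    """Counts from comparing a method (NLP or ICD-10) to the gold standard."""
    tp: int  # symptom present in gold standard and detected
    fp: int  # symptom absent in gold standard but detected
    fn: int  # symptom present in gold standard but missed
    tn: int  # symptom absent in gold standard and not detected


def ppv(c: Confusion) -> float:
    """Positive predictive value (precision)."""
    return c.tp / (c.tp + c.fp)


def sensitivity(c: Confusion) -> float:
    """Sensitivity (recall)."""
    return c.tp / (c.tp + c.fn)


def specificity(c: Confusion) -> float:
    """Specificity (true negative rate)."""
    return c.tn / (c.tn + c.fp)


def f_score(p: float, s: float) -> float:
    """Harmonic mean of PPV and sensitivity."""
    return 2 * p * s / (p + s)


# Agreement values reported in the abstract: PPV=0.972, sensitivity=1.0.
print(round(f_score(0.972, 1.0), 3))  # 0.986

# Hypothetical counts, for illustration only (not study data).
example = Confusion(tp=8, fp=2, fn=2, tn=88)
print(ppv(example), sensitivity(example), specificity(example))
```

Because the F-score is the harmonic mean of PPV and sensitivity, a method with low sensitivity scores poorly even when its PPV is high.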

RESULTS

There were 85,678 ED encounters during the study period, including 4% (n=3420) with patients with COVID-19. NLP was more accurate at identifying encounters with patients who had any of the COVID-19 symptoms (F-score=0.796) than ICD-10 codes (F-score=0.451). NLP accuracy was higher for positive symptoms (sensitivity=0.930) than ICD-10 (sensitivity=0.300). However, ICD-10 accuracy was higher for negative symptoms (specificity=0.994) than NLP (specificity=0.917). Congestion or runny nose showed the highest accuracy difference (NLP: F-score=0.828 and ICD-10: F-score=0.042). For encounters with patients with COVID-19, prevalence estimates of each NLP symptom differed across variant eras. Patients with COVID-19 were more likely to have each NLP symptom detected than patients without this disease. Effect sizes (odds ratios) varied across pandemic eras.
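
As a rough illustration of the effect-size measure mentioned above, an odds ratio for one symptom in one variant era could be computed from a 2x2 table as sketched below. The abstract does not report the underlying counts or how confidence intervals were obtained, so the counts here are hypothetical, the function names are illustrative, and the Woolf (log-based) interval is an assumed, commonly used choice rather than the study's method.

```python
import math


def odds_ratio(a: int, b: int, c: int, d: int) -> float:
    """Odds ratio for a 2x2 table.

    a: COVID-19 positive, symptom detected
    b: COVID-19 positive, symptom not detected
    c: COVID-19 negative, symptom detected
    d: COVID-19 negative, symptom not detected
    """
    return (a * d) / (b * c)


def woolf_ci(a: int, b: int, c: int, d: int, z: float = 1.96) -> tuple[float, float]:
    """Approximate 95% confidence interval using the Woolf (log) method."""
    log_or = math.log(odds_ratio(a, b, c, d))
    se = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    return math.exp(log_or - z * se), math.exp(log_or + z * se)


# Hypothetical counts for a single symptom in a single era (not study data).
a, b, c, d = 40, 60, 20, 80
estimate = odds_ratio(a, b, c, d)
low, high = woolf_ci(a, b, c, d)
print(f"OR={estimate:.2f}, 95% CI=({low:.2f}, {high:.2f})")
```

An odds ratio above 1 means the symptom was detected more often among encounters with COVID-19 than among those without it. Repeating the calculation per era would show how the effect size shifts across variants.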

CONCLUSIONS

This study establishes the value of AI-based NLP as a highly effective tool for real-time COVID-19 symptom detection in pediatric patients, outperforming traditional ICD-10 methods. It also reveals the evolving nature of symptom prevalence across different virus variants, underscoring the need for dynamic, technology-driven approaches in infectious disease surveillance.
