Moving Biosurveillance Beyond Coded Data Using AI for Symptom Detection From Physician Notes: Retrospective Cohort Study.

Affiliations

Computational Health Informatics Program, Boston Children's Hospital, Boston, MA, United States.

Department of Pediatrics, Harvard Medical School, Boston, MA, United States.

Publication information

J Med Internet Res. 2024 Apr 4;26:e53367. doi: 10.2196/53367.

DOI: 10.2196/53367
PMID: 38573752
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC11027052/
Abstract

BACKGROUND

Real-time surveillance of emerging infectious diseases necessitates a dynamically evolving, computable case definition, which frequently incorporates symptom-related criteria. For symptom detection, both population health monitoring platforms and research initiatives primarily depend on structured data extracted from electronic health records.

OBJECTIVE

This study sought to validate and test an artificial intelligence (AI)-based natural language processing (NLP) pipeline for detecting COVID-19 symptoms from physician notes in pediatric patients. We specifically study patients presenting to the emergency department (ED) who can be sentinel cases in an outbreak.

METHODS

Subjects in this retrospective cohort study are patients who are 21 years of age and younger, who presented to a pediatric ED at a large academic children's hospital between March 1, 2020, and May 31, 2022. The ED notes for all patients were processed with an NLP pipeline tuned to detect the mention of 11 COVID-19 symptoms based on Centers for Disease Control and Prevention (CDC) criteria. For a gold standard, 3 subject matter experts labeled 226 ED notes and had strong agreement (F-score=0.986; positive predictive value [PPV]=0.972; and sensitivity=1.0). F-score, PPV, and sensitivity were used to compare the performance of both NLP and the International Classification of Diseases, 10th Revision (ICD-10) coding to the gold standard chart review. As a formative use case, variations in symptom patterns were measured across SARS-CoV-2 variant eras.
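The three metrics named above can be sketched from a 2x2 comparison against the chart-review gold standard. This is a minimal illustration, not the study's code: the `ConfusionCounts` type, the function names, and the example counts are all hypothetical. It assumes the F-score is the harmonic mean of PPV and sensitivity (F1), which is consistent with the reported annotator agreement (PPV=0.972 and sensitivity=1.0 give about 0.986).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ConfusionCounts:
    """Counts from comparing one method (e.g. NLP or ICD-10) to chart review."""
    tp: int  # symptom in gold standard and detected by the method
    fp: int  # detected by the method but absent from gold standard
    fn: int  # in gold standard but missed by the method
    tn: int  # absent from both


def _ratio(numerator: float, denominator: float) -> float:
    """Divide safely; an undefined metric is reported as NaN."""
    return numerator / denominator if denominator else float("nan")


def ppv(c: ConfusionCounts) -> float:
    """Positive predictive value (precision)."""
    return _ratio(c.tp, c.tp + c.fp)


def sensitivity(c: ConfusionCounts) -> float:
    """Sensitivity (recall): share of true symptoms that were detected."""
    return _ratio(c.tp, c.tp + c.fn)


def specificity(c: ConfusionCounts) -> float:
    """Specificity: share of absent symptoms correctly left undetected."""
    return _ratio(c.tn, c.tn + c.fp)


def f_score_from_rates(ppv_value: float, sens_value: float) -> float:
    """F1: harmonic mean of PPV and sensitivity (assumed definition)."""
    return _ratio(2 * ppv_value * sens_value, ppv_value + sens_value)


def f_score(c: ConfusionCounts) -> float:
    return f_score_from_rates(ppv(c), sensitivity(c))


if __name__ == "__main__":
    # Consistency check using the agreement figures reported in the abstract.
    print(round(f_score_from_rates(0.972, 1.0), 3))

    # Hypothetical counts, for illustration only (not study data).
    example = ConfusionCounts(tp=90, fp=10, fn=10, tn=90)
    print(ppv(example), sensitivity(example), specificity(example), f_score(example))
```

Keeping specificity alongside the F-score matters here because the F-score ignores true negatives, so a method that rarely fires (as coded data often does) can score well on specificity while missing most positive symptoms.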

RESULTS

There were 85,678 ED encounters during the study period, of which 4% (n=3420) involved patients with COVID-19. NLP was more accurate at identifying encounters with patients who had any of the COVID-19 symptoms (F-score=0.796) than ICD-10 codes (F-score=0.451). NLP accuracy was higher for positive symptoms (sensitivity=0.930) than ICD-10 (sensitivity=0.300). However, ICD-10 accuracy was higher for negative symptoms (specificity=0.994) than NLP (specificity=0.917). Congestion or runny nose showed the largest accuracy difference (NLP: F-score=0.828 and ICD-10: F-score=0.042). For encounters with patients with COVID-19, prevalence estimates of each NLP symptom differed across variant eras. Patients with COVID-19 were more likely to have each NLP symptom detected than patients without this disease. Effect sizes (odds ratios) varied across pandemic eras.
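As a rough illustration of the effect sizes mentioned above, an odds ratio for one symptom within one variant era can be computed from a 2x2 table of COVID-19 status against symptom detection. The abstract does not say how the odds ratios or their intervals were estimated, so the Wald 95% interval below is an assumption, and the function name and the counts are hypothetical.

```python
import math


def odds_ratio(
    covid_with_symptom: int,
    covid_without_symptom: int,
    no_covid_with_symptom: int,
    no_covid_without_symptom: int,
    z: float = 1.96,  # normal quantile for an approximate 95% interval
) -> tuple[float, float, float]:
    """Return (odds ratio, lower bound, upper bound) using a Wald interval."""
    cells = (
        covid_with_symptom,
        covid_without_symptom,
        no_covid_with_symptom,
        no_covid_without_symptom,
    )
    if min(cells) <= 0:
        # The plain Wald formula is undefined with empty cells.
        raise ValueError("all four cells must be positive")

    estimate = (covid_with_symptom * no_covid_without_symptom) / (
        covid_without_symptom * no_covid_with_symptom
    )
    # Standard error of log(OR) is the root of the summed reciprocal counts.
    std_error = math.sqrt(sum(1 / cell for cell in cells))
    lower = math.exp(math.log(estimate) - z * std_error)
    upper = math.exp(math.log(estimate) + z * std_error)
    return estimate, lower, upper


if __name__ == "__main__":
    # Hypothetical counts for a single symptom in a single era (not study data).
    estimate, lower, upper = odds_ratio(20, 80, 10, 90)
    print(f"OR={estimate:.2f} (95% CI {lower:.2f} to {upper:.2f})")
```

Running the same calculation separately for each variant era is one simple way to see whether the association between a symptom and COVID-19 status shifts over time.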

CONCLUSIONS

This study establishes the value of AI-based NLP as a highly effective tool for real-time COVID-19 symptom detection in pediatric patients, outperforming traditional ICD-10 methods. It also reveals the evolving nature of symptom prevalence across different virus variants, underscoring the need for dynamic, technology-driven approaches in infectious disease surveillance.

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/c050/11027052/fecbeb41b53c/jmir_v26i1e53367_fig3.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/c050/11027052/cd591a2dd904/jmir_v26i1e53367_fig1.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/c050/11027052/639e5e9737bf/jmir_v26i1e53367_fig2.jpg