Automated Insomnia Phenotyping from Electronic Health Records: Leveraging Large Language Models to Decode Clinical Narratives.

Author information

Lopez-Garcia Guillermo, Weissenbacher Davy, Stadler Matthew, O'Connor Karen, Xu Dongfang, Gryboski Lauren, Heavens Jared, Abu-El-Rub Noor, Mazzotti Diego R, Chakravorty Subhajit, Gonzalez-Hernandez Graciela

Affiliations

Department of Computational Biomedicine, Cedars-Sinai Medical Center, West Hollywood, CA.

Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA.

Publication information

medRxiv. 2025 Jun 3:2025.06.02.25328701. doi: 10.1101/2025.06.02.25328701.

Abstract

Insomnia is a highly prevalent but often underdiagnosed condition in clinical practice. Its inconsistent documentation in electronic health records (EHRs) limits population-level analyses and obstructs efforts to evaluate treatment patterns or outcomes. We present a novel, fully automated approach for phenotyping insomnia directly from unstructured clinical notes using generative large language models (LLMs). Leveraging prompt engineering with few-shot learning and chain-of-thought reasoning, we evaluated our system on two distinct corpora: inpatient clinical notes from MIMIC-III and outpatient primary care notes from the University of Kansas Health System (KUMC). Our models, Llama 70B and Llama 405B, achieved F1 scores of 93.0 on the MIMIC corpus and 85.7 on the KUMC corpus, substantially outperforming domain-adapted BERT-based classifiers. Ultimately, our framework offers a scalable and interpretable solution for clinical phenotyping of insomnia and can serve as a blueprint for similar efforts targeting other underdiagnosed or under-documented conditions in the EHR.
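
The two technical ingredients named in the abstract, a few-shot chain-of-thought prompt and the F1 metric, can be illustrated with a minimal Python sketch. This is not the authors' implementation: the function names `build_prompt` and `f1_score`, the instruction wording, and the example notes and labels are all invented here for illustration, and no model is actually called.

```python
def build_prompt(examples, note):
    """Assemble a few-shot chain-of-thought prompt (hypothetical format).

    Each worked example shows a note, a short reasoning trace, and an answer;
    the target note is appended with an open "Reasoning:" slot for the model.
    """
    parts = ["Decide whether the clinical note documents insomnia. "
             "Reason step by step, then answer yes or no."]
    for ex in examples:
        parts.append(
            f"Note: {ex['note']}\nReasoning: {ex['reasoning']}\nAnswer: {ex['answer']}"
        )
    parts.append(f"Note: {note}\nReasoning:")
    return "\n\n".join(parts)


def f1_score(gold, predicted):
    """Binary F1 for the positive class (1 = insomnia, 0 = no insomnia)."""
    pairs = list(zip(gold, predicted))
    tp = sum(1 for g, p in pairs if g == 1 and p == 1)
    fp = sum(1 for g, p in pairs if g == 0 and p == 1)
    fn = sum(1 for g, p in pairs if g == 1 and p == 0)
    if tp == 0:
        return 0.0  # precision and recall are both zero or undefined
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Invented examples, not taken from MIMIC-III or KUMC.
examples = [
    {"note": "Reports difficulty falling asleep most nights; started on trazodone.",
     "reasoning": "Sleep-onset difficulty plus a sleep medication suggests insomnia.",
     "answer": "yes"},
    {"note": "Sleeping well, no complaints. Follow-up for hypertension.",
     "reasoning": "No sleep complaint and no sleep medication is mentioned.",
     "answer": "no"},
]
prompt = build_prompt(examples, "Wakes at 3 am daily and cannot return to sleep.")

# Toy gold labels and predictions for six notes.
gold = [1, 1, 1, 0, 0, 1]
pred = [1, 1, 0, 0, 1, 1]
score = f1_score(gold, pred)
```

The abstract reports F1 on a 0 to 100 scale, which corresponds to multiplying the value returned by `f1_score` by 100.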

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/2559/12155036/b71aac286d93/nihpp-2025.06.02.25328701v1-f0001.jpg
