Applying Large Language Models to Assess Quality of Care: Monitoring ADHD Medication Side Effects.

Author Information

Bannett Yair, Gunturkun Fatma, Pillai Malvika, Herrmann Jessica E, Luo Ingrid, Huffman Lynne C, Feldman Heidi M

Affiliations

Division of Developmental-Behavioral Pediatrics, Stanford University School of Medicine, Stanford, California.

Stanford Quantitative Sciences Unit, Stanford, California.

Publication Information

Pediatrics. 2025 Jan 1;155(1). doi: 10.1542/peds.2024-067223.

Abstract

OBJECTIVE

To assess the accuracy of a large language model (LLM) in measuring clinician adherence to practice guidelines for monitoring side effects after prescribing medications for children with attention-deficit/hyperactivity disorder (ADHD).

METHODS

Retrospective population-based cohort study of electronic health records. Cohort included children aged 6 to 11 years with ADHD diagnosis and 2 or more ADHD medication encounters (stimulants or nonstimulants prescribed) between 2015 and 2022 in a community-based primary health care network (n = 1201). To identify documentation of side effects inquiry, we trained, tested, and deployed an open-source LLM (LLaMA) on all clinical notes from ADHD-related encounters (ADHD diagnosis or ADHD medication prescription), including in-clinic/telehealth and telephone encounters (n = 15 628 notes). Model performance was assessed using holdout and deployment test sets, compared with manual medical record review.
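The abstract does not publish the training pipeline, so the following is a minimal sketch of how a LLaMA-family model could be fine-tuned as a binary classifier of clinical notes (side effects inquiry documented: yes/no) using the Hugging Face transformers Trainer API. The checkpoint name, file paths, maximum sequence length, and hyperparameters are illustrative assumptions, not details reported by the authors.

```python
# Minimal sketch (not the authors' code): fine-tuning a LLaMA-family model
# as a binary classifier of clinical notes. Checkpoint, paths, max length,
# and hyperparameters are assumptions for illustration.
from datasets import load_dataset
from transformers import (AutoModelForSequenceClassification, AutoTokenizer,
                          Trainer, TrainingArguments)

MODEL = "meta-llama/Llama-2-7b-hf"  # assumed; the abstract says only "LLaMA"

tokenizer = AutoTokenizer.from_pretrained(MODEL)
tokenizer.pad_token = tokenizer.eos_token  # LLaMA tokenizers ship no pad token

model = AutoModelForSequenceClassification.from_pretrained(MODEL, num_labels=2)
model.config.pad_token_id = tokenizer.pad_token_id

# Each row: text = full note text, label = 1 if side effects inquiry documented.
data = load_dataset("csv", data_files={"train": "train_notes.csv",
                                       "test": "holdout_notes.csv"})

def tokenize(batch):
    return tokenizer(batch["text"], truncation=True, max_length=2048)

data = data.map(tokenize, batched=True)

trainer = Trainer(
    model=model,
    args=TrainingArguments(output_dir="se-inquiry-classifier",
                           per_device_train_batch_size=4,
                           num_train_epochs=3),
    train_dataset=data["train"],
    eval_dataset=data["test"],
    tokenizer=tokenizer,  # enables dynamic padding via the default collator
)
trainer.train()
```

Per the abstract, the trained classifier was then deployed across all 15 628 notes and its output compared with manual medical record review.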

RESULTS

The LLaMA model accurately classified notes that contained side effects inquiry (sensitivity = 87.2, specificity = 86.3, area under curve = 0.93 on holdout test set). Analyses revealed no model bias in relation to patient sex or insurance. Mean age (SD) at first prescription was 8.8 (1.6) years; characteristics were mostly similar across patients with and without documented side effects inquiry. Rates of documented side effects inquiry were lower for telephone encounters than for in-clinic/telehealth encounters (51.9% vs 73.0%, P < .001). Side effects inquiry was documented in 61.4% of encounters after stimulant prescriptions and 48.5% of encounters after nonstimulant prescriptions (P = .041).
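For reference, sensitivity, specificity, and area under the ROC curve are typically computed on a holdout set as in the scikit-learn sketch below. The labels and scores are toy values, and the 0.5 decision threshold is an assumption (the abstract does not report the model's operating point).

```python
# Sketch of the holdout-set evaluation reported above, using scikit-learn.
# y_true: chart-review labels; y_score: model probability of the positive
# class (side effects inquiry documented). Values here are toy data.
from sklearn.metrics import confusion_matrix, roc_auc_score

y_true = [1, 0, 1, 1, 0, 0, 1, 0]
y_score = [0.91, 0.12, 0.78, 0.55, 0.40, 0.08, 0.83, 0.61]
y_pred = [int(s >= 0.5) for s in y_score]  # assumed 0.5 threshold

tn, fp, fn, tp = confusion_matrix(y_true, y_pred).ravel()
sensitivity = tp / (tp + fn)          # abstract reports 87.2 (%)
specificity = tn / (tn + fp)          # abstract reports 86.3 (%)
auc = roc_auc_score(y_true, y_score)  # abstract reports 0.93

print(f"sens={sensitivity:.3f}  spec={specificity:.3f}  auc={auc:.3f}")
```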

CONCLUSIONS

Deploying an LLM on a variable set of clinical notes, including telephone notes, offered scalable measurement of quality of care and uncovered opportunities to improve psychopharmacological medication management in primary care.
