A machine learning model using clinical notes to identify physician fatigue.

Authors

Hsu Chao-Chun, Obermeyer Ziad, Tan Chenhao

Affiliations

University of Chicago, Chicago, IL, USA.

University of California, Berkeley, CA, USA.

Publication

Nat Commun. 2025 Jul 1;16(1):5791. doi: 10.1038/s41467-025-60865-4.

DOI:10.1038/s41467-025-60865-4
PMID:40592818
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC12217598/
Abstract

Clinical notes should capture important information from a physician-patient encounter, but they may also contain signals indicative of physician fatigue. Using data from 129,228 emergency department (ED) visits, we train a model to identify notes written by physicians who are likely to be tired: those who worked ED shifts on at least 5 of the prior 7 days. In a hold-out set, the model accurately identifies notes written by such high-workload physicians. It also flags notes written in other settings with high fatigue: overnight shifts and high patient volumes. When the model identifies signs of fatigue in a note, physician decision-making for that patient appears worse: yield of testing for heart attack is 19% lower with each standard deviation increase in model-predicted fatigue. A key feature of notes written by fatigued doctors is the predictability of the next word, given the preceding context. Perhaps unsurprisingly, because word prediction is the core of how large language models (LLMs) work, we find that predicted fatigue of LLM-written notes is 74% higher than that of physician-written ones, highlighting the possibility that LLMs may introduce distortions in generated text that are not yet fully understood.
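
Two ideas in the abstract lend themselves to a concrete illustration: the high-workload label (ED shifts on at least 5 of the prior 7 days) and the predictability of the next word given the preceding context. The Python sketch below is illustrative only and is not the authors' code. All function names are hypothetical, the 7-day window is assumed to exclude the day the note is written, and an add-one-smoothed bigram model stands in for whatever language model the study actually used.

```python
import math
from collections import Counter
from datetime import date, timedelta


def is_high_workload(shift_dates, note_date, min_days=5):
    """Return True if shifts fall on at least `min_days` of the 7 days before `note_date`.

    Assumption: the window covers note_date - 7 .. note_date - 1 (the note's own day is excluded).
    """
    window = {note_date - timedelta(days=k) for k in range(1, 8)}
    return len(window & set(shift_dates)) >= min_days


def mean_next_word_log_prob(train_notes, note_tokens):
    """Average log-probability of each next word under an add-one-smoothed bigram model.

    Higher (less negative) values mean the note is more predictable.
    This is a toy stand-in for a real language model.
    """
    if len(note_tokens) < 2:
        raise ValueError("need at least two tokens to score next-word prediction")
    context_counts = Counter()
    bigram_counts = Counter()
    vocab = set(note_tokens)
    for tokens in train_notes:
        vocab.update(tokens)
        for prev, nxt in zip(tokens, tokens[1:]):
            context_counts[prev] += 1
            bigram_counts[(prev, nxt)] += 1
    vocab_size = len(vocab)
    log_probs = [
        math.log((bigram_counts[(prev, nxt)] + 1) / (context_counts[prev] + vocab_size))
        for prev, nxt in zip(note_tokens, note_tokens[1:])
    ]
    return sum(log_probs) / len(log_probs)


# Hypothetical toy data, not drawn from the study.
shifts = [date(2024, 1, d) for d in (1, 2, 3, 5, 6)]
train = [["chest", "pain", "no", "fever"], ["chest", "pain", "no", "cough"]]
routine = mean_next_word_log_prob(train, ["chest", "pain", "no"])
unusual = mean_next_word_log_prob(train, ["no", "chest", "fever"])
```

In this toy setup a note that repeats common phrasing (`routine`) scores as more predictable than one with unfamiliar word order (`unusual`), which mirrors the direction of the signal the abstract describes without reproducing its magnitude.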

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/9f64/12217598/2b4ea1a4893b/41467_2025_60865_Fig1_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/9f64/12217598/11cfae9f3520/41467_2025_60865_Fig2_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/9f64/12217598/baf7c69e708c/41467_2025_60865_Fig3_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/9f64/12217598/8ca9b278ebb3/41467_2025_60865_Fig4_HTML.jpg
