Using Natural Language Processing to Automatically Assess Feedback Quality: Findings From 3 Surgical Residencies.

Author information

E. Ötleş is Medical Scientist Training Program fellow, Department of Industrial and Operations Engineering, University of Michigan Medical School, Ann Arbor, Michigan.

D.E. Kendrick is assistant professor, Department of Surgery, University of Minnesota Medical School, Minneapolis, Minnesota.

Publication information

Acad Med. 2021 Oct 1;96(10):1457-1460. doi: 10.1097/ACM.0000000000004153.

DOI: 10.1097/ACM.0000000000004153
PMID: 33951682
Abstract

PURPOSE

Learning is markedly improved with high-quality feedback, yet assuring the quality of feedback is difficult to achieve at scale. Natural language processing (NLP) algorithms may be useful in this context as they can automatically classify large volumes of narrative data. However, it is unknown if NLP models can accurately evaluate surgical trainee feedback. This study evaluated which NLP techniques best classify the quality of surgical trainee formative feedback recorded as part of a workplace assessment.

METHOD

During the 2016-2017 academic year, the SIMPL (Society for Improving Medical Professional Learning) app was used to record operative performance narrative feedback for residents at 3 university-based general surgery residency training programs. Feedback comments were collected for a sample of residents representing all 5 postgraduate year levels and coded for quality. In May 2019, the coded comments were then used to train NLP models to automatically classify the quality of feedback across 4 categories (effective, mediocre, ineffective, or other). Models included support vector machines (SVM), logistic regression, gradient boosted trees, naive Bayes, and random forests. The primary outcome was mean classification accuracy.
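The workflow described above (manually coded comments, a text classifier, mean accuracy as the outcome) can be illustrated with a minimal sketch. The abstract does not state how the text was turned into features or how accuracy was averaged, so the bag-of-words tokenizer, the k-fold split, and every example comment below are assumptions made for illustration only. Naive Bayes is used because it is one of the listed model families and fits in a few lines of standard-library Python; this is not the authors' implementation.

```python
import math
import statistics
from collections import Counter, defaultdict


def tokenize(text):
    """Lowercase whitespace tokenizer (an assumed, deliberately simple choice)."""
    return text.lower().split()


class NaiveBayes:
    """Multinomial naive Bayes with add-one (Laplace) smoothing."""

    def fit(self, texts, labels):
        self.class_counts = Counter(labels)
        self.word_counts = defaultdict(Counter)
        self.vocab = set()
        for text, label in zip(texts, labels):
            for tok in tokenize(text):
                self.word_counts[label][tok] += 1
                self.vocab.add(tok)
        return self

    def predict(self, text):
        total_docs = sum(self.class_counts.values())
        vocab_size = len(self.vocab)
        # Tokens never seen in training carry no evidence, so they are skipped.
        tokens = [t for t in tokenize(text) if t in self.vocab]
        best_label, best_score = None, -math.inf
        for label in sorted(self.class_counts):
            score = math.log(self.class_counts[label] / total_docs)
            denom = sum(self.word_counts[label].values()) + vocab_size
            for tok in tokens:
                score += math.log((self.word_counts[label][tok] + 1) / denom)
            if score > best_score:
                best_label, best_score = label, score
        return best_label


def cross_val_accuracy(texts, labels, k=2):
    """Return (mean, population std) of accuracy over k interleaved folds."""
    accuracies = []
    for fold in range(k):
        test_idx = set(range(fold, len(texts), k))
        train = [(t, y) for i, (t, y) in enumerate(zip(texts, labels)) if i not in test_idx]
        test = [(t, y) for i, (t, y) in enumerate(zip(texts, labels)) if i in test_idx]
        model = NaiveBayes().fit([t for t, _ in train], [y for _, y in train])
        correct = sum(model.predict(t) == y for t, y in test)
        accuracies.append(correct / len(test))
    return statistics.mean(accuracies), statistics.pstdev(accuracies)


# Hypothetical comments and labels, invented for this sketch (not study data).
comments = [
    "next time keep the camera steady and retract before dissecting",
    "improve by identifying the plane earlier then clip the duct",
    "good job work on efficiency",
    "nice work keep practicing suturing",
    "good job",
    "great case",
    "resident left early for clinic",
    "case cancelled",
]
labels = [
    "effective", "effective",
    "mediocre", "mediocre",
    "ineffective", "ineffective",
    "other", "other",
]

mean_acc, std_acc = cross_val_accuracy(comments, labels, k=2)
print(f"mean accuracy = {mean_acc:.2f}, std = {std_acc:.2f}")

model = NaiveBayes().fit(comments, labels)
print(model.predict("good job"))
```

With only eight toy comments the printed accuracy is not meaningful; the point is the shape of the pipeline: coded text in, one predicted category out, accuracy averaged over held-out folds.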

RESULTS

The authors manually coded the quality of 600 recorded feedback comments. Those data were used to train NLP models to automatically classify the quality of feedback across 4 categories. The NLP model using an SVM algorithm yielded a maximum mean accuracy of 0.64 (standard deviation, 0.01). When the classification task was modified to distinguish only high-quality vs low-quality feedback, maximum mean accuracy was 0.83, again with SVM.
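The gap between the 4-category and the 2-category accuracy can be made concrete with a small scoring sketch. The abstract does not say which of the 4 categories counted as high quality, so the mapping below (only "effective" is high quality) is an assumption, and the label lists are invented. The study presumably retrained its models on the modified task; this sketch only re-scores one fixed set of predictions to show why a coarser task tends to score higher.

```python
# ASSUMPTION: only "effective" counts as high-quality feedback.
# The abstract does not specify the mapping used in the study.
HIGH_QUALITY = {"effective"}


def to_binary(label):
    """Collapse a 4-category label into 'high' or 'low'."""
    return "high" if label in HIGH_QUALITY else "low"


def accuracy(y_true, y_pred):
    """Fraction of positions where the prediction matches the coded label."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


# Hypothetical coded labels and model predictions (not study data).
y_true = ["effective", "mediocre", "ineffective", "other", "effective", "mediocre"]
y_pred = ["effective", "ineffective", "ineffective", "mediocre", "mediocre", "mediocre"]

four_class_acc = accuracy(y_true, y_pred)
binary_acc = accuracy([to_binary(y) for y in y_true],
                      [to_binary(y) for y in y_pred])

print(f"4-category accuracy: {four_class_acc:.2f}")
print(f"binary accuracy:     {binary_acc:.2f}")
```

Any prediction that is correct in the 4-category setting stays correct after collapsing, and confusions between two low-quality categories stop counting as errors, so the binary score on the same predictions can never be lower.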

CONCLUSIONS

To the authors' knowledge, this is the first study to examine the use of NLP for classifying feedback quality. SVM NLP models demonstrated the ability to automatically classify the quality of surgical trainee evaluations. Larger training datasets would likely further increase accuracy.

Similar articles

1
Using Natural Language Processing to Automatically Assess Feedback Quality: Findings From 3 Surgical Residencies.
Acad Med. 2021 Oct 1;96(10):1457-1460. doi: 10.1097/ACM.0000000000004153.
2
Natural Language Processing and Assessment of Resident Feedback Quality.
J Surg Educ. 2021 Nov-Dec;78(6):e72-e77. doi: 10.1016/j.jsurg.2021.05.012. Epub 2021 Jun 21.
3
The Quality of Operative Performance Narrative Feedback: A Retrospective Data Comparison Between End of Rotation Evaluations and Workplace-based Assessments.
Ann Surg. 2022 Mar 1;275(3):617-620. doi: 10.1097/SLA.0000000000003907.
4
Feedback-Seeking Behavior and Practice Readiness for General Surgery.
J Surg Educ. 2022 Mar-Apr;79(2):295-301. doi: 10.1016/j.jsurg.2021.10.003. Epub 2021 Nov 25.
5
Using Natural Language Processing to Evaluate the Quality of Supervisor Narrative Comments in Competency-Based Medical Education.
Acad Med. 2024 May 1;99(5):534-540. doi: 10.1097/ACM.0000000000005634. Epub 2024 Jan 12.
6
SIMPLifying Urology Residency Operative Assessments: A Pilot Study in Urology Training.
J Urol. 2021 Oct;206(4):1009-1019. doi: 10.1097/JU.0000000000001874. Epub 2021 May 25.
7
Association of Gender and Operative Feedback Quality in Surgical Residents.
J Surg Educ. 2023 Nov;80(11):1516-1521. doi: 10.1016/j.jsurg.2023.06.004. Epub 2023 Jun 27.
8
Four-Year Analysis of a Novel Milestone-Based Assessment of Faculty by General Surgical Residents.
J Surg Educ. 2018 Nov;75(6):e126-e133. doi: 10.1016/j.jsurg.2018.08.008. Epub 2018 Sep 15.
9
Examining the Impact of Using the SIMPL Application on Feedback in Surgical Education.
J Surg Educ. 2018 Nov;75(6):e246-e254. doi: 10.1016/j.jsurg.2018.08.009. Epub 2018 Sep 11.
10
Quality of Operative Performance Feedback Given to Thoracic Surgery Residents Using an App-Based System.
J Surg Educ. 2017 Nov-Dec;74(6):e81-e87. doi: 10.1016/j.jsurg.2017.10.001.

Cited by

1
Artificial Intelligence in Medical Education: a Scoping Review of the Evidence for Efficacy and Future Directions.
Med Sci Educ. 2025 Apr 2;35(3):1803-1816. doi: 10.1007/s40670-025-02373-0. eCollection 2025 Jun.
2
Modern artificial intelligence and large language models in graduate medical education: a scoping review of attitudes, applications & practice.
BMC Med Educ. 2025 May 20;25(1):730. doi: 10.1186/s12909-025-07321-5.
3
The Role of Natural Language Processing in Graduate Medical Education: A Scoping Review.
Cureus. 2025 Mar 24;17(3):e81078. doi: 10.7759/cureus.81078. eCollection 2025 Mar.
4
EMYWAY Workplace-Based Entrustable Professional Activities Assessments in Otolaryngology Residency Training: A Nationwide Experience.
Otolaryngol Head Neck Surg. 2025 Apr;172(4):1242-1253. doi: 10.1002/ohn.1104. Epub 2024 Dec 31.
5
Finding the Needle in the Haystack: Can Natural Language Processing of Students' Evaluations of Teachers Identify Teaching Concerns?
J Gen Intern Med. 2025 Jan;40(1):119-123. doi: 10.1007/s11606-024-08990-6. Epub 2024 Aug 21.
6
Assessing the Ability of a Large Language Model to Score Free-Text Medical Student Clinical Notes: Quantitative Study.
JMIR Med Educ. 2024 Jul 25;10:e56342. doi: 10.2196/56342.
7
Applications of natural language processing tools in the surgical journey.
Front Surg. 2024 May 17;11:1403540. doi: 10.3389/fsurg.2024.1403540. eCollection 2024.
8
The Surgical Clerkship in the COVID Era: A Natural Language Processing and Thematic Analysis.
J Surg Res. 2024 Jul;299:155-162. doi: 10.1016/j.jss.2024.04.016. Epub 2024 May 16.
9
Automating the Identification of Feedback Quality Criteria and the CanMEDS Roles in Written Feedback Comments Using Natural Language Processing.
Perspect Med Educ. 2023 Dec 18;12(1):540-549. doi: 10.5334/pme.1056. eCollection 2023.
10
Innovations in surgical training: exploring the role of artificial intelligence and large language models (LLM).
Rev Col Bras Cir. 2023 Aug 25;50:e20233605. doi: 10.1590/0100-6991e-20233605-en. eCollection 2023.