Automating the Identification of Feedback Quality Criteria and the CanMEDS Roles in Written Feedback Comments Using Natural Language Processing.

Affiliations

Department of Educational Sciences at Ghent University, Belgium.

Language and Translation Technology Team at Ghent University, Belgium.

Publication information

Perspect Med Educ. 2023 Dec 18;12(1):540-549. doi: 10.5334/pme.1056. eCollection 2023.

DOI:10.5334/pme.1056

PMID:38144670

Original article: https://pmc.ncbi.nlm.nih.gov/articles/PMC10742245/

Abstract

INTRODUCTION

Manually analysing the quality of large amounts of written feedback comments is time-consuming and demands extensive resources and human effort. Therefore, this study aimed to explore whether a state-of-the-art large language model (LLM) could be fine-tuned to identify the presence of four literature-derived feedback quality criteria and the seven CanMEDS roles in written feedback comments.

METHODS

A set of 2,349 labelled feedback comments from five healthcare educational programs in Flanders (Belgium) (specialist medicine, general practice, midwifery, speech therapy and occupational therapy) was split into 12,452 sentences to create two datasets for the machine learning analysis. The Dutch BERT models BERTje and RobBERT were used to train four multiclass-multilabel classification models: two to identify the four feedback quality criteria and two to identify the seven CanMEDS roles.
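To make the multilabel setup concrete, the sketch below shows one way a comment could be split into sentences and each sentence's labels encoded as a 0/1 vector, which is the usual target format for a multilabel classifier. It is an illustration only: the abstract does not describe how the sentences were split or encoded, the role names are taken from the CanMEDS 2015 framework rather than from the text above, and the example comment, its annotations, and the helper names `split_sentences` and `multi_hot` are all hypothetical.

```python
import re

# The seven roles of the CanMEDS 2015 framework (assumption: not listed in the abstract).
CANMEDS_ROLES = [
    "Medical Expert", "Communicator", "Collaborator", "Leader",
    "Health Advocate", "Scholar", "Professional",
]


def split_sentences(comment):
    """Naive splitter: break after '.', '!' or '?' when followed by whitespace."""
    parts = re.split(r"(?<=[.!?])\s+", comment.strip())
    return [p for p in parts if p]


def multi_hot(labels, classes=CANMEDS_ROLES):
    """Encode a set of labels as a 0/1 vector with one position per class."""
    unknown = set(labels) - set(classes)
    if unknown:
        raise ValueError(f"unknown labels: {sorted(unknown)}")
    return [1 if c in labels else 0 for c in classes]


# Hypothetical comment and annotations, invented purely for illustration.
comment = (
    "You explained the diagnosis clearly to the patient. "
    "Work more closely with the nurses next time."
)
sentences = split_sentences(comment)
annotations = [{"Medical Expert", "Communicator"}, {"Collaborator"}]

# Each item pairs a sentence with its multi-hot target vector.
dataset = [(s, multi_hot(a)) for s, a in zip(sentences, annotations)]
```

A sentence may carry several labels at once or none at all, which is why the target is a vector of independent 0/1 entries rather than a single class index.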

RESULTS

The classification models trained with BERTje and RobBERT to predict the presence of the four feedback quality criteria attained macro average F1-scores of 0.73 and 0.76, respectively. The models predicting the presence of the CanMEDS roles attained F1-scores of 0.71 with BERTje and 0.72 with RobBERT.
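For readers unfamiliar with the metric, a macro average F1-score is the unweighted mean of the F1-scores computed separately for each label, so rare labels count as much as frequent ones. The sketch below computes it for multi-hot targets and predictions. It is a minimal illustration with invented toy data, not the authors' evaluation code, and the function names are hypothetical.

```python
def f1_for_label(y_true, y_pred, idx):
    """F1 for one label: 2*TP / (2*TP + FP + FN), or 0.0 if undefined."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t[idx] == 1 and p[idx] == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t[idx] == 0 and p[idx] == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t[idx] == 1 and p[idx] == 0)
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


def macro_f1(y_true, y_pred):
    """Unweighted mean of the per-label F1-scores."""
    n_labels = len(y_true[0])
    return sum(f1_for_label(y_true, y_pred, i) for i in range(n_labels)) / n_labels


# Toy data (invented): four sentences, two labels.
y_true = [[1, 0], [1, 1], [0, 1], [0, 0]]
y_pred = [[1, 0], [0, 1], [0, 1], [1, 0]]

score = macro_f1(y_true, y_pred)
```

In this toy case the first label has one true positive, one false positive and one false negative (F1 = 0.5), the second label is predicted perfectly (F1 = 1.0), and the macro average is therefore 0.75.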

DISCUSSION

The results showed that a state-of-the-art LLM is able to identify the presence of the four feedback quality criteria and the CanMEDS roles in written feedback comments. This implies that the quality analysis of written feedback comments can be automated using an LLM, leading to savings of time and resources.
