Rao Arya S, Prasad Siona, Lee Richard S, Farrell Susan, McKinley Sophia, Succi Marc D
Harvard Medical School, Boston, MA.
Medically Engineered Solutions in Healthcare Incubator, Innovation in Operations Research Center (MESH IO), Mass General Brigham, Boston, MA.
Mayo Clin Proc Digit Health. 2025 Jun 9;3(3):100241. doi: 10.1016/j.mcpdig.2025.100241. eCollection 2025 Sep.
To develop and validate an artificial intelligence-powered platform that simulates surgical oral examinations, addressing the limitations of traditional faculty-led sessions.
This cross-sectional study, conducted from June 1, 2024, through December 1, 2024, comprised technical validation and educational assessment of a novel large language model (LLM)-based surgical education tool (surgery oral examination large language model [SOE-LLM]). The study involved 12 surgical clerkship students completing their core rotation at a major academic medical center. The SOE-LLM, using MIMIC-IV-derived surgical cases (acute appendicitis and pancreatitis), was implemented to simulate oral examinations. Technical validation assessed performance across 8 domains, including case presentation accuracy, physical examination findings, historical detail preservation, laboratory data reporting, imaging interpretation, management decisions, and recognition of contraindicated interventions. Educational utility was evaluated using a 5-point Likert scale.
Technical validation showed the SOE-LLM's ability to function as a consistent oral examiner. The model accurately guided students through case presentations, responded to diagnostic questions, and provided clinically sound responses based on MIMIC-IV cases. When tested with standardized prompts, it maintained examination fidelity, requiring proper diagnostic reasoning and differentiating operative versus medical management. Student evaluations highlighted the platform's value as an examination preparation tool (mean, 4.250; SEM, 0.1794) and its ability to create a low-stakes environment for high-stakes decision practice (mean, 4.833; SEM, 0.1124).
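As a reading aid for the Likert summaries above, the following is a minimal sketch of how a mean and a standard error of the mean (SEM) are computed for 12 responses on a 5-point scale, assuming the SEM is the sample standard deviation divided by the square root of n. The response lists and the function name `mean_and_sem` are hypothetical, chosen only so that the arithmetic matches the reported summary values; they are not the study's data.

```python
import math
import statistics


def mean_and_sem(responses):
    """Return (mean, SEM) for a list of Likert responses.

    SEM is taken as the sample standard deviation (n - 1 denominator)
    divided by sqrt(n); this convention is an assumption.
    """
    n = len(responses)
    mean = statistics.mean(responses)
    sem = statistics.stdev(responses) / math.sqrt(n)
    return mean, sem


# Hypothetical responses from 12 students (NOT the study's actual data).
exam_prep = [5] * 4 + [4] * 7 + [3] * 1
low_stakes = [5] * 10 + [4] * 2

for label, data in [("exam preparation", exam_prep),
                    ("low-stakes environment", low_stakes)]:
    mean, sem = mean_and_sem(data)
    print(f"{label}: mean={mean:.3f}, SEM={sem:.4f}")
# exam preparation: mean=4.250, SEM=0.1794
# low-stakes environment: mean=4.833, SEM=0.1124
```

With only 12 respondents, a single changed response shifts the mean by about 0.08, so these summaries are best read as descriptive.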
The SOE-LLM shows potential as a valuable tool for surgical education, offering a consistent and accessible platform for simulating oral examinations.