Development and Evaluation of an Artificial Intelligence-Powered Surgical Oral Examination Simulator: A Pilot Study.

Authors

Rao Arya S, Prasad Siona, Lee Richard S, Farrell Susan, McKinley Sophia, Succi Marc D

Affiliations

Harvard Medical School, Boston, MA.

Medically Engineered Solutions in Healthcare Incubator, Innovation in Operations Research Center (MESH IO), Mass General Brigham, Boston, MA.

Publication information

Mayo Clin Proc Digit Health. 2025 Jun 9;3(3):100241. doi: 10.1016/j.mcpdig.2025.100241. eCollection 2025 Sep.

Abstract

OBJECTIVE

To develop and validate an artificial intelligence-powered platform that simulates surgical oral examinations, addressing the limitations of traditional faculty-led sessions.

PATIENTS AND METHODS

This cross-sectional study, conducted from June 1, 2024, through December 1, 2024, comprised technical validation and educational assessment of a novel large language model (LLM)-based surgical education tool (surgery oral examination large language model [SOE-LLM]). The study involved 12 surgical clerkship students completing their core rotation at a major academic medical center. The SOE-LLM, using MIMIC-IV-derived surgical cases (acute appendicitis and pancreatitis), was implemented to simulate oral examinations. Technical validation assessed performance across 8 domains: case presentation accuracy, physical examination findings, historical detail preservation, laboratory data reporting, imaging interpretation, management decisions, and recognition of contraindicated interventions. Educational utility was evaluated using a 5-point Likert scale.

RESULTS

Technical validation showed the SOE-LLM's ability to function as a consistent oral examiner. The model accurately guided students through case presentations, responded to diagnostic questions, and provided clinically sound responses based on MIMIC-IV cases. When tested with standardized prompts, it maintained examination fidelity, requiring proper diagnostic reasoning and differentiating operative versus medical management. Student evaluations highlighted the platform's value as an examination preparation tool (mean, 4.250; SEM, 0.1794) and its ability to create a low-stakes environment for high-stakes decision practice (mean, 4.833; SEM, 0.1124).

CONCLUSION

The SOE-LLM shows potential as a valuable tool for surgical education, offering a consistent and accessible platform for simulating oral examinations.
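
The Likert results above are reported as a mean and a standard error of the mean (SEM). As a rough aid to reading them, the sketch below recovers the implied standard deviation from SEM = SD / sqrt(n) and computes an approximate 95% interval. It assumes that all 12 students answered each item (the abstract does not state the response count per item) and uses a normal approximation, which is coarse for a sample this small. The function names and item labels are illustrative only and do not come from the study.

```python
import math

# Assumption: every one of the 12 students answered each survey item.
N_STUDENTS = 12


def sd_from_sem(sem, n):
    """Recover the sample standard deviation from SEM = SD / sqrt(n)."""
    return sem * math.sqrt(n)


def approx_interval(mean, sem, z=1.96, lo=1.0, hi=5.0):
    """Normal-approximation 95% interval, clipped to the 1-5 Likert range."""
    half = z * sem
    return max(lo, mean - half), min(hi, mean + half)


# Mean and SEM values as reported in the abstract; labels are paraphrased.
items = {
    "examination preparation tool": (4.250, 0.1794),
    "low-stakes practice environment": (4.833, 0.1124),
}

for label, (mean, sem) in items.items():
    sd = sd_from_sem(sem, N_STUDENTS)
    low, high = approx_interval(mean, sem)
    print(f"{label}: SD ~ {sd:.3f}, approx. 95% interval {low:.2f} to {high:.2f}")
```

With a sample of 12, a t-based interval would be somewhat wider than this normal approximation, so the printed ranges are best read as a lower bound on the uncertainty.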
