Glicksman Michael, Wang Sheri, Yellapragada Samir, Robinson Christopher, Orhurhu Vwaire, Emerick Trent
Department of Physical Medicine and Rehabilitation, University of Pittsburgh Medical Center (UPMC), Pittsburgh, Pennsylvania, USA.
Department of Anesthesiology and Perioperative Medicine, University of Pittsburgh Medical Center (UPMC), Pittsburgh, Pennsylvania, USA.
Pain Pract. 2025 Jan;25(1):e13428. doi: 10.1111/papr.13428. Epub 2024 Nov 26.
Artificial intelligence (AI) is an exciting and evolving technology that is increasingly used across pain medicine. Large language models (LLMs) are one type of AI that has become particularly popular. Currently, there is a paucity of literature analyzing the impact that AI may have on trainee education. We therefore sought to assess the potential benefits and pitfalls of AI in pain medicine trainee education. Given the rapidly increasing popularity of LLMs, we specifically assessed, through a pilot quality improvement project, how LLMs may promote or hinder trainee education.
A comprehensive search of the existing literature on AI within medicine was performed to identify its potential benefits and pitfalls within pain medicine. The pilot project was approved by the UPMC Quality Improvement Review Committee (#4547). Three of the most commonly used LLMs at the initiation of this pilot study (ChatGPT Plus, Google Bard, and Bing AI) were asked a series of multiple-choice questions to evaluate their ability to assist learner education within pain medicine.
Potential benefits of AI within pain medicine trainee education include ease of use, imaging interpretation, procedural/surgical skills training, learner assessment, personalized learning experiences, the ability to summarize vast amounts of knowledge, and preparation for the future of pain medicine. Potential pitfalls include discrepancies between AI devices and associated cost differences, correlating radiographic findings to clinical significance, interpersonal/communication skills, educational disparities, bias/plagiarism/cheating concerns, lack of incorporation of private domain literature, and absence of training specifically for pain medicine education. In the quality improvement project, ChatGPT Plus answered the highest percentage of all questions correctly (16/17). The LLMs' lowest correctness scores were on first-order questions, with Google Bard and Bing AI answering 4/9 and 3/9 first-order questions correctly, respectively. Qualitative evaluation of the LLM-provided explanations for second- and third-order questions revealed some reasoning inconsistencies (e.g., providing flawed information in selecting the correct answer).
AI is a continually evolving and promising modality to assist trainees pursuing a career in pain medicine. Still, limitations currently exist that may hinder its independent use in this setting. Future research exploring how AI may overcome these challenges is thus required. Until then, AI should be used with caution, as a supplementary tool within pain medicine trainee education.