New York Institute of Technology - College of Osteopathic Medicine, Old Westbury, NY, USA.
Department of Orthopaedic Surgery, Northwell Health, Donald and Barbara Zucker School of Medicine, Huntington, NY, USA.
J Shoulder Elbow Surg. 2024 Aug;33(8):e429-e437. doi: 10.1016/j.jse.2023.11.014. Epub 2024 Jan 3.
BACKGROUND: Artificial intelligence (AI) and large language models (LLMs) offer a new potential resource for patient education. Answers to frequently asked questions (FAQs) given by Chat Generative Pre-Trained Transformer (ChatGPT), an LLM-based AI text bot, were compared with answers provided by a contemporary Google search to determine the reliability of the information these sources provide for patient education in upper extremity arthroplasty.

METHODS: "Total shoulder arthroplasty" (TSA) and "total elbow arthroplasty" (TEA) were entered into Google Search and ChatGPT 3.0 to identify the ten most frequently asked questions for each procedure. On Google, the FAQs were obtained through the "people also ask" section, whereas ChatGPT was asked directly to provide the ten most frequently asked questions. Each question, its answer, and any cited references were recorded. A modified version of the Rothwell system was used to categorize questions into 10 subtopics: special activities, timeline of recovery, restrictions, technical details, cost, indications/management, risks and complications, pain, longevity, and evaluation of surgery. Each reference was assigned to one of the following groups: commercial, academic, medical practice, single surgeon personal, or social media. Questions for TSA and TEA were combined for analysis and compared between Google and ChatGPT with a 2-sample z-test for proportions.

RESULTS: Overall, most questions were related to procedural indications or management (17.5%). There were no significant differences between Google and ChatGPT in the distribution of question categories. The majority of references were from academic websites (65%). ChatGPT provided a greater proportion of academic references than Google (80% vs. 50%; P = .047), whereas Google more commonly provided medical practice references (25% vs. 0%; P = .017).

CONCLUSION: In conjunction with patient-physician discussions, AI LLMs may provide a reliable resource for patients. By providing information based on academic references, these tools have the potential to improve health literacy and shared decision making for patients searching for information about TSA and TEA.

CLINICAL SIGNIFICANCE: With the rising prevalence of AI programs, it is essential to understand how these applications affect patient education in medicine.
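For readers wanting to reproduce the comparison in METHODS, below is a minimal Python sketch of the 2-sample z-test for proportions. It assumes 20 questions per source (10 TSA + 10 TEA) and infers the reference counts from the reported percentages; neither the denominators nor the counts come from the study's raw data.

# Minimal sketch of the 2-sample z-test for proportions from METHODS.
# Assumption: 20 questions per source (10 TSA + 10 TEA); counts below are
# inferred from the reported percentages, not taken from the study's raw data.
from statsmodels.stats.proportion import proportions_ztest

# Academic references: ChatGPT 80% (16/20) vs. Google 50% (10/20)
z, p = proportions_ztest(count=[16, 10], nobs=[20, 20], alternative="two-sided")
print(f"academic: z = {z:.2f}, P = {p:.3f}")          # reproduces P = .047

# Medical practice references: Google 25% (5/20) vs. ChatGPT 0% (0/20)
z, p = proportions_ztest(count=[5, 0], nobs=[20, 20], alternative="two-sided")
print(f"medical practice: z = {z:.2f}, P = {p:.3f}")  # reproduces P = .017

Under these assumed counts, the pooled-variance z-test reproduces the P values reported in RESULTS (.047 and .017), which supports the inference but does not confirm the study's actual denominators.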