Comparative Efficacy of ChatGPT and DeepSeek in Addressing Patient Queries on Gonarthrosis and Total Knee Arthroplasty

Author information

Gurbuz Serhat, Bahar Hakan, Yavuz Ulas, Keskin Ahmet, Karslioglu Bulent, Solak Yener

Affiliation

Department of Orthopedics and Traumatology, Baltalimanı Bone Diseases Training and Research Hospital, Istanbul, Turkey.

Publication information

Arthroplast Today. 2025 Jun 2;33:101730. doi: 10.1016/j.artd.2025.101730. eCollection 2025 Jun.

DOI:10.1016/j.artd.2025.101730

PMID:40521295

Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC12167088/

Abstract

BACKGROUND

The advent of artificial intelligence (AI) in healthcare has opened new avenues for patient education and anxiety reduction. This study aims to compare the efficacy of 2 prominent AI platforms, ChatGPT and DeepSeek, in providing accurate and satisfactory responses to patients with gonarthrosis contemplating total knee arthroplasty (TKA).

METHODS

A prospective, comparative trial was conducted involving 100 patients diagnosed with gonarthrosis and indicated for TKA. Each patient posed 5 questions regarding the surgery and postoperative rehabilitation to both ChatGPT and DeepSeek. Responses were evaluated by 2 blinded orthopaedic specialists on a 10-point scale for accuracy and patient satisfaction. Patients also rated their satisfaction with each response on a 10-point scale. The primary outcome measures were the mean accuracy scores from specialists and mean satisfaction scores from patients.

RESULTS

Statistical analysis revealed significant differences between ChatGPT and DeepSeek in both accuracy and patient satisfaction (P < .001). ChatGPT demonstrated superior performance with a mean accuracy score of 8.7 ± 0.9 compared to DeepSeek's 7.4 ± 1.2. Patient satisfaction scores aligned with expert evaluations, with ChatGPT achieving a mean satisfaction score of 8.9 ± 0.8 vs DeepSeek's 7.6 ± 1.1. Notably, ChatGPT excelled in providing comprehensive explanations of surgical procedures (mean score 9.2 ± 0.7) and postoperative care (9.1 ± 0.8), while DeepSeek performed better in offering concise summaries of recovery timelines (8.3 ± 0.9).
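As a rough plausibility check on the reported means and standard deviations, the sketch below computes a Welch t statistic and Cohen's d from summary statistics alone. The abstract does not name the statistical test that was used. Because each patient queried both platforms, a paired test on per-patient data would be the natural choice, and that data is not available here. The independent-samples treatment, the sample size of 100 per platform, and the function names are therefore all assumptions made for illustration, not the authors' method.

```python
import math


def welch_t(mean_a, sd_a, n_a, mean_b, sd_b, n_b):
    """Welch t statistic for two independent groups, from summary statistics."""
    standard_error = math.sqrt(sd_a**2 / n_a + sd_b**2 / n_b)
    return (mean_a - mean_b) / standard_error


def cohens_d(mean_a, sd_a, mean_b, sd_b):
    """Cohen's d using the root mean square of the two SDs (equal group sizes)."""
    pooled_sd = math.sqrt((sd_a**2 + sd_b**2) / 2)
    return (mean_a - mean_b) / pooled_sd


def two_sided_p_normal(t):
    """Two-sided p-value from a normal approximation (reasonable for large n)."""
    return math.erfc(abs(t) / math.sqrt(2))


# Assumption: one averaged score per patient, so 100 observations per platform.
N = 100

# (ChatGPT mean, ChatGPT SD, DeepSeek mean, DeepSeek SD) as reported in the abstract.
reported = {
    "accuracy": (8.7, 0.9, 7.4, 1.2),
    "satisfaction": (8.9, 0.8, 7.6, 1.1),
}

for outcome, (m_a, sd_a, m_b, sd_b) in reported.items():
    t = welch_t(m_a, sd_a, N, m_b, sd_b, N)
    d = cohens_d(m_a, sd_a, m_b, sd_b)
    p = two_sided_p_normal(t)
    print(f"{outcome}: t = {t:.2f}, d = {d:.2f}, p < .001: {p < 0.001}")
```

Under these assumptions both differences are large relative to their standard errors, which is consistent with the reported P < .001. A paired analysis could give a different test statistic, depending on how strongly the two scores correlate within each patient.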

CONCLUSIONS

This study demonstrates that ChatGPT offers more accurate and satisfactory responses than DeepSeek to patient queries regarding gonarthrosis and TKA. The findings suggest that AI platforms, particularly ChatGPT, can serve as valuable tools in augmenting patient education and potentially reducing preoperative anxiety. Future research should explore the integration of AI-assisted information delivery in clinical practice and its long-term impact on patient outcomes.
