Breaking Bones, Breaking Barriers: ChatGPT, DeepSeek, and Gemini in Hand Fracture Management.

Authors

Marcaccini Gianluca, Seth Ishith, Xie Yi, Susini Pietro, Pozzi Mirco, Cuomo Roberto, Rozen Warren M

Affiliations

Plastic Surgery Unit, Department of Medicine, Surgery and Neuroscience, University of Siena, 53100 Siena, Italy.

Department of Plastic and Reconstructive Surgery, Peninsula Health, Frankston, VIC 3199, Australia.

Publication information

J Clin Med. 2025 Mar 14;14(6):1983. doi: 10.3390/jcm14061983.

DOI:10.3390/jcm14061983

PMID:40142791

Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC11942733/

Abstract

Background: Hand fracture management requires accurate diagnosis and complex decision-making. Advances in artificial intelligence (AI) suggest that large language models (LLMs) may assist or even rival traditional clinical approaches. This study evaluates the effectiveness of ChatGPT-4o, DeepSeek-V3, and Gemini 1.5 in diagnosing hand fractures and recommending treatment strategies, compared with experienced surgeons.

Methods: A retrospective analysis of 58 anonymized hand fracture cases was conducted. Clinical details, including fracture site, displacement, and soft-tissue involvement, were provided to the AI models, which generated management plans. Their recommendations were compared with the surgeons' actual decisions and assessed for accuracy, precision, recall, and F1 score.

Results: ChatGPT-4o demonstrated the highest accuracy (98.28%) and recall (91.74%), identifying most correct interventions but occasionally proposing extraneous options (precision 58.48%). DeepSeek-V3 showed moderate accuracy (63.79%), with balanced precision (61.17%) and recall (57.89%), sometimes omitting correct treatments. Gemini 1.5 performed poorly (accuracy 18.97%), with low precision and recall, indicating substantial limitations in clinical decision support.

Conclusions: AI models can enhance clinical workflows, particularly in radiographic interpretation and triage, but their limitations highlight the irreplaceable role of human expertise in complex hand trauma management. ChatGPT-4o demonstrated promising accuracy but requires refinement. Ethical concerns regarding AI-driven medical decisions, including bias and transparency, must be addressed before widespread clinical implementation.
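The abstract does not say how the four metrics were computed, so the following is only a minimal sketch under stated assumptions. It treats each case as a set of recommended treatments compared with the set the surgeon actually used, pools the counts across cases, and (as an assumption) counts a case as correct when every treatment the surgeon used appears among the model's proposals. The function name `score_cases` and the treatment labels are hypothetical. Under this reading, extraneous proposals lower precision without lowering recall, which matches the pattern described for ChatGPT-4o. The last lines check a simple piece of arithmetic: the reported accuracies are consistent with 57, 37, and 11 correct cases out of 58.

```python
from typing import Dict, List, Set


def score_cases(predicted: List[Set[str]], actual: List[Set[str]]) -> Dict[str, float]:
    """Pool treatment-level counts over all cases and derive four metrics."""
    if len(predicted) != len(actual):
        raise ValueError("predicted and actual must have one entry per case")

    tp = fp = fn = 0
    cases_correct = 0
    for pred, gold in zip(predicted, actual):
        tp += len(pred & gold)  # proposed by the model and used by the surgeon
        fp += len(pred - gold)  # proposed but not used (extraneous option)
        fn += len(gold - pred)  # used but not proposed (omitted treatment)
        # ASSUMPTION: a case is "correct" when every treatment the surgeon
        # used appears among the model's proposals.
        cases_correct += gold <= pred

    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    accuracy = cases_correct / len(actual) if actual else 0.0
    return {"accuracy": accuracy, "precision": precision, "recall": recall, "f1": f1}


# Hypothetical toy data: three cases, not taken from the study.
predicted = [{"K-wire fixation", "splinting"}, {"splinting"}, {"ORIF"}]
actual = [{"K-wire fixation"}, {"splinting"}, {"K-wire fixation"}]
print(score_cases(predicted, actual))

# Case counts out of 58 that reproduce the reported accuracy percentages.
for n_correct in (57, 37, 11):
    print(n_correct, round(100 * n_correct / 58, 2))  # 98.28, 63.79, 18.97
```

In the toy data, the first case includes one extraneous proposal and the third misses the surgeon's treatment entirely, so precision and recall diverge in the same direction as described for the models above.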
