Diagnosis, treatment, and prevention of ankle sprains: Comparing free chatbot recommendations with clinical guidelines.

Authors

Roch Friederike Eva, Hahn Franziska Melanie, Jäckle Katharina, Meier Marc-Pascal, Stinus Hartmut, Lehmann Wolfgang, Perthel Ronny, Roch Paul Jonathan

Affiliation

Department of Trauma Surgery, Orthopaedics and Plastic Surgery, University of Göttingen, Robert-Koch-Str. 40, Göttingen 37075, Germany.

Publication details

Foot Ankle Surg. 2025 Jun;31(4):329-351. doi: 10.1016/j.fas.2024.12.003. Epub 2024 Dec 13.

DOI: 10.1016/j.fas.2024.12.003
PMID: 39730224
Abstract

BACKGROUND

Free chatbots powered by large language models offer treatment recommendations for lateral ankle sprains (LAS), but these recommendations lack scientific validation.

METHODS

Three chatbots (Claude, Perplexity, and ChatGPT) were evaluated by comparing their responses to a questionnaire and their treatment algorithms against current clinical guidelines. Responses were graded on accuracy, conclusiveness, supplementary information, and incompleteness, and were evaluated both individually and collectively, with a 60% pass threshold.

RESULTS

The collective analysis of the questionnaire showed Perplexity scored significantly higher than Claude and ChatGPT (p < 0.001). In the individual analysis, Perplexity provided significantly more supplementary information than the other chatbots (p < 0.001). All chatbots met the pass threshold. In the algorithm evaluation, ChatGPT scored significantly higher than the others (p = 0.023), with Perplexity below the pass threshold.

CONCLUSIONS

Chatbots' recommendations generally aligned with current guidelines but sometimes missed crucial details. While they offer useful supplementary information, they cannot yet replace professional medical consultation or established guidelines.
