Are large language models a useful resource to address common patient concerns on hallux valgus? A readability analysis.

Authors

Hlavinka William J, Sontam Tarun R, Gupta Anuj, Croen Brett J, Abdullah Mohammed S, Humbyrd Casey J

Affiliations

Texas A&M School of Medicine, Baylor University Medical Center, Department of Medical Education, 3500 Gaston Avenue, 6-Roberts, Dallas, TX 75246, USA.

Department of Orthopedic Surgery, University of Pennsylvania Health System, 51 N 39th St, Philadelphia, PA 19104, USA.

Publication information

Foot Ankle Surg. 2025 Jan;31(1):15-19. doi: 10.1016/j.fas.2024.08.002. Epub 2024 Aug 6.

Abstract

BACKGROUND

This study evaluates the accuracy and readability of responses from Google and from ChatGPT-3.5 and ChatGPT-4.0 (two versions of an artificial intelligence model) to common questions regarding bunion surgery.

METHODS

A Google search of "bunionectomy" was performed, and the first ten questions under "People Also Ask" were recorded. ChatGPT-3.5 and 4.0 were asked these ten questions individually, and their answers were analyzed using the Flesch-Kincaid Reading Ease and Gunning-Fog Level algorithms.
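The abstract does not state which software computed the scores. As an illustration only, the sketch below applies the standard published formulas for Flesch Reading Ease and the Gunning Fog index. The function names, the vowel-group syllable heuristic, and the simplified rule that any word of three or more syllables counts as "complex" are assumptions made for this example, not the authors' method, so the values it produces will differ somewhat from those of dedicated readability tools.

```python
import re


def count_syllables(word: str) -> int:
    """Estimate syllables by counting vowel groups (a heuristic, not a dictionary lookup)."""
    word = word.lower()
    count = len(re.findall(r"[aeiouy]+", word))
    # Treat a trailing "e" as silent ("make"), but keep "-le" endings ("table").
    if word.endswith("e") and not word.endswith("le") and count > 1:
        count -= 1
    return max(count, 1)


def _split(text: str):
    """Return (sentences, words); raise if the text has no words or sentences."""
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    words = re.findall(r"[A-Za-z]+(?:'[A-Za-z]+)?", text)
    if not sentences or not words:
        raise ValueError("text must contain at least one sentence and one word")
    return sentences, words


def flesch_reading_ease(text: str) -> float:
    """206.835 - 1.015 * (words / sentences) - 84.6 * (syllables / words)."""
    sentences, words = _split(text)
    syllables = sum(count_syllables(w) for w in words)
    return 206.835 - 1.015 * (len(words) / len(sentences)) - 84.6 * (syllables / len(words))


def gunning_fog(text: str) -> float:
    """0.4 * ((words / sentences) + 100 * (complex words / words)).

    Simplification (assumption): every word with >= 3 estimated syllables is
    "complex"; the usual exclusions (proper nouns, common suffixes) are ignored.
    """
    sentences, words = _split(text)
    complex_words = sum(1 for w in words if count_syllables(w) >= 3)
    return 0.4 * ((len(words) / len(sentences)) + 100 * (complex_words / len(words)))


if __name__ == "__main__":
    sample = "The cat sat on the mat."
    print(round(flesch_reading_ease(sample), 3), round(gunning_fog(sample), 1))
```

Higher Flesch Reading Ease values indicate easier text, while the Gunning Fog index approximates the years of schooling needed to understand the text, so the two scores move in opposite directions as difficulty increases.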

RESULTS

Compared with Google, ChatGPT-3.5 and 4.0 produced longer responses, with word counts of 315 ± 39 (p < .0001) and 294 ± 39 (p < .0001), respectively. Both ChatGPT versions also differed significantly from Google in Flesch-Kincaid Reading Ease (p < .0001).

CONCLUSIONS

Our findings demonstrate that ChatGPT provided significantly longer responses than Google and that reading ease differed significantly between the two platforms. Both platforms exceeded the seventh- to eighth-grade reading level of the average U.S. adult.

LEVEL OF EVIDENCE

N/A.
