Hlavinka William J, Sontam Tarun R, Gupta Anuj, Croen Brett J, Abdullah Mohammed S, Humbyrd Casey J
Texas A&M School of Medicine, Baylor University Medical Center, Department of Medical Education, 3500 Gaston Avenue, 6-Roberts, Dallas, TX 75246, USA.
Department of Orthopedic Surgery, University of Pennsylvania Health System, 51 N 39th St, Philadelphia, PA 19104, USA.
Foot Ankle Surg. 2025 Jan;31(1):15-19. doi: 10.1016/j.fas.2024.08.002. Epub 2024 Aug 6.
This study evaluates the accuracy and readability of responses from Google, ChatGPT-3.5, and ChatGPT-4.0 (two versions of an artificial intelligence model) to common questions regarding bunion surgery.
A Google search of "bunionectomy" was performed, and the first ten questions under "People Also Ask" were recorded. ChatGPT-3.5 and 4.0 were each asked these ten questions individually, and their answers were analyzed using the Flesch-Kincaid Reading Ease and Gunning Fog Index algorithms.
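Both readability measures used in the methods are simple closed-form formulas over word, sentence, syllable, and complex-word counts. As a minimal sketch (the function names and the choice to pass pre-computed counts rather than raw text are illustrative, not from the study), they can be computed as:

```python
def flesch_kincaid_reading_ease(words: int, sentences: int, syllables: int) -> float:
    """Flesch-Kincaid Reading Ease: higher scores mean easier text.

    Scores of roughly 60-70 correspond to plain English readable by
    13- to 15-year-olds; professional/academic text often scores below 30.
    """
    return 206.835 - 1.015 * (words / sentences) - 84.6 * (syllables / words)


def gunning_fog_index(words: int, sentences: int, complex_words: int) -> float:
    """Gunning Fog Index: estimates the U.S. school grade level needed.

    "Complex words" are words of three or more syllables (excluding
    proper nouns, familiar jargon, and common suffix forms).
    """
    return 0.4 * ((words / sentences) + 100 * (complex_words / words))


# Example: a 100-word passage with 5 sentences, 150 syllables,
# and 10 complex words.
print(flesch_kincaid_reading_ease(100, 5, 150))  # 59.635
print(gunning_fog_index(100, 5, 10))             # 12.0
```

In practice, studies like this one typically obtain the counts (and scores) from an automated readability tool rather than computing them by hand; the formulas above show what those tools calculate.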
Compared with Google, ChatGPT-3.5 and 4.0 produced longer responses, with 315 ± 39 words (p < .0001) and 294 ± 39 words (p < .0001), respectively. Flesch-Kincaid Reading Ease scores also differed significantly between both ChatGPT versions and Google (p < .0001).
Our findings demonstrate that ChatGPT provided significantly longer responses than Google, with a significant difference in reading ease. Both platforms exceeded the seventh- to eighth-grade reading level recommended for U.S. health information.