Mastrokostas Paul G, Mastrokostas Leonidas E, Emara Ahmed K, Wellington Ian J, Ginalis Elizabeth, Houten John K, Khalsa Amrit S, Saleh Ahmed, Razi Afshin E, Ng Mitchell K
College of Medicine, State University of New York (SUNY) Downstate, Brooklyn, NY, USA.
Brooklyn College of the City University of New York, Brooklyn, NY, USA.
Global Spine J. 2024 Nov;14(8):2389-2398. doi: 10.1177/21925682241241241. Epub 2024 Mar 21.
Comparative study.
This study aims to compare Google and GPT-4 in terms of (1) question types, (2) response readability, (3) source quality, and (4) numerical response accuracy for the top 10 most frequently asked questions (FAQs) about anterior cervical discectomy and fusion (ACDF).
"Anterior cervical discectomy and fusion" was searched on Google and GPT-4 on December 18, 2023. The top 10 FAQs from each platform were classified according to the Rothwell system. Source quality was evaluated using JAMA benchmark criteria, and readability was assessed using Flesch Reading Ease and Flesch-Kincaid grade level. Differences in JAMA scores, Flesch-Kincaid grade level, Flesch Reading Ease, and word count between platforms were analyzed using Student's t-tests. Statistical significance was set at the .05 level.
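For readers unfamiliar with the two readability metrics, the following minimal Python sketch shows how they are computed from word, sentence, and syllable counts. The two formulas are the standard published ones; the tokenization and the vowel-group syllable counter (`text_counts`, `count_syllables`) are simplifying assumptions made for illustration and are not the tool used in the study, so scores may differ slightly from those reported.

```python
import re


def count_syllables(word: str) -> int:
    """Rough heuristic (an assumption, not a dictionary lookup):
    count runs of vowels after dropping a likely silent trailing 'e'."""
    word = word.lower()
    if word.endswith("e") and not word.endswith("le") and len(word) > 2:
        word = word[:-1]
    return max(1, len(re.findall(r"[aeiouy]+", word)))


def text_counts(text: str) -> tuple[int, int, int]:
    """Return (words, sentences, syllables) using simple regex tokenization."""
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    words = re.findall(r"[A-Za-z]+(?:'[A-Za-z]+)?", text)
    syllables = sum(count_syllables(w) for w in words)
    return len(words), len(sentences), syllables


def flesch_reading_ease(words: int, sentences: int, syllables: int) -> float:
    """Higher scores indicate easier text."""
    return 206.835 - 1.015 * (words / sentences) - 84.6 * (syllables / words)


def flesch_kincaid_grade(words: int, sentences: int, syllables: int) -> float:
    """Approximate US school grade level needed to read the text."""
    return 0.39 * (words / sentences) + 11.8 * (syllables / words) - 15.59


if __name__ == "__main__":
    sample = "The surgeon removes the damaged disc. The bones are then fused."
    w, s, syl = text_counts(sample)
    print(f"words={w}, sentences={s}, syllables={syl}")
    print(f"Flesch Reading Ease: {flesch_reading_ease(w, s, syl):.2f}")
    print(f"Flesch-Kincaid grade: {flesch_kincaid_grade(w, s, syl):.2f}")
```

Longer sentences and more syllables per word lower the Reading Ease score and raise the grade level, which is why the two metrics move in opposite directions when comparing platforms.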
Frequently asked questions from Google were varied, while GPT-4 focused on technical details and indications/management. GPT-4 showed a higher Flesch-Kincaid grade level (12.96 vs 9.28, P = .003), lower Flesch Reading Ease score (37.07 vs 54.85, P = .005), and higher scores for source quality (3.333 vs 1.800, P = .016). Numerically, 6 out of 10 responses varied between platforms, with GPT-4 providing broader recovery timelines for ACDF.
This study demonstrates GPT-4's ability to elevate patient education by providing high-quality, diverse information tailored to those with advanced literacy levels. As AI technology evolves, refining these tools for accuracy and user-friendliness remains crucial, catering to patients' varying literacy levels and information needs in spine surgery.