Departments of Orthopaedic Surgery, SUNY Downstate Health Sciences University, College of Medicine, Brooklyn, NY.
Boston University School of Medicine, Boston, MA.
Clin Spine Surg. 2024 Dec 1;37(10):E394-E403. doi: 10.1097/BSD.0000000000001582. Epub 2024 Feb 20.
Retrospective Observational Study.
The objective of this study was to assess the utility of ChatGPT, an artificial intelligence chatbot, in providing patient information for lumbar spinal fusion and lumbar laminectomy in comparison with the Google search engine.
ChatGPT, an artificial intelligence chatbot with seemingly unlimited functionality, may present an alternative to a Google web search for patients seeking information about medical questions. With widespread misinformation and suboptimal quality of online health information, it is imperative to assess ChatGPT as a resource for this purpose.
The first 10 frequently asked questions (FAQs) related to the search terms "lumbar spinal fusion" and "lumbar laminectomy" were extracted from Google and ChatGPT. Responses to shared questions were compared regarding length and readability, using the Flesch Reading Ease score and Flesch-Kincaid Grade Level. Numerical FAQs from Google were replicated in ChatGPT.
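For readers unfamiliar with the two readability metrics, the following is a minimal sketch of how they can be computed from the standard published formulas. It is not the tool used in the study, which the abstract does not name. The function names `count_syllables` and `readability` are hypothetical, and the syllable counter is a rough vowel-group heuristic, so scores will differ somewhat from those of dedicated readability software.

```python
import re


def count_syllables(word: str) -> int:
    """Approximate syllables by counting vowel groups (heuristic, an assumption)."""
    word = word.lower()
    count = len(re.findall(r"[aeiouy]+", word))
    # Discount a trailing silent 'e' (e.g. "spine"), but not an "-le" ending.
    if word.endswith("e") and not word.endswith("le") and count > 1:
        count -= 1
    return max(count, 1)


def readability(text: str) -> dict:
    """Return word count, Flesch Reading Ease, and Flesch-Kincaid Grade Level."""
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    words = re.findall(r"[A-Za-z']+", text)
    if not sentences or not words:
        raise ValueError("text must contain at least one sentence and one word")

    syllables = sum(count_syllables(w) for w in words)
    words_per_sentence = len(words) / len(sentences)
    syllables_per_word = syllables / len(words)

    return {
        "words": len(words),
        # Higher score = easier to read.
        "flesch_reading_ease": 206.835
        - 1.015 * words_per_sentence
        - 84.6 * syllables_per_word,
        # Approximate US school grade needed to understand the text.
        "flesch_kincaid_grade": 0.39 * words_per_sentence
        + 11.8 * syllables_per_word
        - 15.59,
    }


if __name__ == "__main__":
    sample = "Lumbar fusion joins two or more bones in the lower spine."
    print(readability(sample))
```

A higher Flesch Reading Ease score indicates easier text, while a higher Flesch-Kincaid Grade Level indicates harder text, so the two metrics move in opposite directions as text becomes more complex.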
Two of 10 (20%) questions for both lumbar spinal fusion and lumbar laminectomy were asked similarly between ChatGPT and Google. Compared with Google, ChatGPT's responses were lengthier (340.0 vs. 159.3 words) and of lower readability (Flesch Reading Ease score: 34.0 vs. 58.2; Flesch-Kincaid grade level: 11.6 vs. 8.8). Subjectively, we evaluated these responses to be accurate and adequately nonspecific. Each response concluded with a recommendation to discuss further with a health care provider. Over half of the numerical questions from Google produced a varying or nonnumerical response in ChatGPT.
FAQs and responses regarding lumbar spinal fusion and lumbar laminectomy were highly variable between Google and ChatGPT. While ChatGPT may be able to produce relatively accurate responses in select questions, its role remains as a supplement or starting point to a consultation with a physician, not as a replacement, and should be taken with caution until its functionality can be validated.