Ahmed H Shafeeq, Thrishulamurthy Chinmayee J
Department of Ophthalmology, Bangalore Medical College and Research Institute, Bangalore, India.
Eur J Ophthalmol. 2025 Mar;35(2):466-473. doi: 10.1177/11206721241272251. Epub 2024 Aug 7.
The rising popularity of chatbots, particularly OpenAI's ChatGPT, among the general public, and their utility in healthcare, is a topic of current controversy. The present study aimed to assess the reliability and accuracy of ChatGPT's responses to inquiries posed by parents, focusing on a range of pediatric ophthalmological and strabismus conditions.
Patient queries were collected via thematic analysis and posed to ChatGPT (version 3.5) in three separate instances each. The questions were divided into 12 domains, totalling 817 unique questions. Two experienced pediatric ophthalmologists rated the quality of every response on a Likert scale. All responses were evaluated for readability using the Flesch-Kincaid Grade Level (FKGL) and character count.
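The FKGL metric estimates the U.S. school grade needed to understand a text from average sentence length and average syllables per word. The sketch below illustrates the standard formula; the syllable counter is a crude vowel-group heuristic for demonstration only, not the tooling the study used.

```python
import re

def count_syllables(word):
    # Crude heuristic: count vowel groups, discount a silent trailing 'e'.
    word = word.lower()
    n = len(re.findall(r"[aeiouy]+", word))
    if word.endswith("e") and n > 1 and not word.endswith(("le", "ee")):
        n -= 1
    return max(n, 1)

def fkgl(text):
    # Flesch-Kincaid Grade Level:
    # 0.39 * (words/sentences) + 11.8 * (syllables/words) - 15.59
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    words = re.findall(r"[A-Za-z']+", text)
    syllables = sum(count_syllables(w) for w in words)
    return (0.39 * len(words) / len(sentences)
            + 11.8 * syllables / len(words)
            - 15.59)
```

A mean FKGL of 14.49, as reported below, corresponds to college-level reading, well above the sixth-to-eighth-grade level commonly recommended for patient education materials.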
A total of 638 responses (78.09%) were scored completely correct, 156 (19.09%) correct but incomplete, and only 23 (2.81%) partially incorrect. None of the responses were scored completely incorrect. The mean FKGL score was 14.49 (95% CI 14.40-14.59) and the mean character count was 1825.33 (95% CI 1791.95-1858.70), with p = 0.831 and 0.697 respectively. The minimum and maximum FKGL scores were 10.60 and 18.34. FKGL predicted character count: R² = .012, F(1, 815) = 10.26, p = .001.
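The reported regression of character count on FKGL is a single-predictor ordinary least squares fit, where the F statistic follows directly from R² and the sample size. A minimal sketch, using synthetic data since the study's dataset is not public:

```python
import numpy as np

def simple_ols(x, y):
    # One-predictor least-squares fit, with R^2 and the F(1, n-2) statistic.
    slope, intercept = np.polyfit(x, y, 1)
    pred = slope * x + intercept
    ss_res = np.sum((y - pred) ** 2)
    ss_tot = np.sum((y - np.mean(y)) ** 2)
    r2 = 1 - ss_res / ss_tot
    n = len(x)
    f = r2 / (1 - r2) * (n - 2)  # F statistic for a single predictor
    return slope, intercept, r2, f

# Synthetic example: FKGL-like predictor vs. a correlated response length.
rng = np.random.default_rng(0)
x = rng.normal(14.5, 1.0, 200)
y = 2.0 * x + rng.normal(0.0, 0.5, 200)
slope, intercept, r2, f = simple_ols(x, y)
```

As a sanity check on the reported figures: with n = 817 and R² = .012, F = .012/.988 × 815 ≈ 9.9, consistent with the reported F(1, 815) = 10.26 once rounding of R² is accounted for.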
ChatGPT provided accurate and reliable information for the majority of questions. However, the readability of its responses was well above the levels typically recommended for adult patient materials, which is concerning. Despite this limitation, it is evident that this technology will play a significant role in the healthcare industry.