University of California, San Francisco School of Medicine, San Francisco, CA, United States.
McGovern Medical School, University of Texas Health Science Center at Houston, Houston, TX, United States.
J Med Internet Res. 2024 Aug 15;26:e52401. doi: 10.2196/52401.
We queried ChatGPT (OpenAI) and Google Assistant about amblyopia and compared their answers with the keywords found on the American Association for Pediatric Ophthalmology and Strabismus (AAPOS) website, specifically its section on amblyopia. Of the 26 keywords chosen from the website, ChatGPT included 11 (42%) in its responses, while Google Assistant included 8 (31%).
Our study investigated the adherence of ChatGPT-3.5 and Google Assistant to the guidelines of the AAPOS for patient education on amblyopia.
ChatGPT-3.5 was used for this study. The four questions, taken from the AAPOS website, specifically its glossary section on amblyopia, are as follows: (1) What is amblyopia? (2) What causes amblyopia? (3) How is amblyopia treated? (4) What happens if amblyopia is untreated? The keywords from AAPOS, selected and approved by ophthalmologists (GW and DL), were words or phrases deemed significant for the education of patients with amblyopia. The Flesch-Kincaid Grade Level formula, approved by the US Department of Education, was used to evaluate the reading comprehension level of the responses from ChatGPT, Google Assistant, and AAPOS.
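For reference, the Flesch-Kincaid Grade Level is a standard readability formula based on average sentence length and average syllables per word (shown here for the reader's convenience; it is not reproduced in the study itself):

$$\mathrm{FKGL} = 0.39\left(\frac{\text{total words}}{\text{total sentences}}\right) + 11.8\left(\frac{\text{total syllables}}{\text{total words}}\right) - 15.59$$

The result approximates the US school grade level required to comprehend the text.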
In its responses, ChatGPT did not mention the term "ophthalmologist," whereas Google Assistant and AAPOS mentioned it once and twice, respectively. ChatGPT did, however, use the term "eye doctors" once. According to the Flesch-Kincaid test, the average reading level of the AAPOS responses was grade 11.4 (SD 2.1), the lowest of the three, while that of Google Assistant was grade 13.1 (SD 4.8), the highest and also the most variable across responses. ChatGPT's answers averaged a grade level of 12.4 (SD 1.1). Overall, the three sources were similar in reading difficulty. For the keywords, across the 4 responses, ChatGPT used 42% (11/26) of the keywords, whereas Google Assistant used 31% (8/26).
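To illustrate the keyword-coverage metric, the following is a minimal sketch, not the study's actual procedure; the keyword list and response text below are hypothetical placeholders, not the 26 AAPOS keywords:

# A minimal sketch (not the authors' code) of the keyword-coverage metric:
# the fraction of AAPOS keywords appearing in a chatbot's combined responses.
def keyword_coverage(keywords, responses):
    """Return the fraction of keywords found in the concatenated responses."""
    text = " ".join(responses).lower()
    hits = sum(1 for kw in keywords if kw.lower() in text)
    return hits / len(keywords)

# Hypothetical placeholder data, for illustration only:
keywords = ["lazy eye", "patching", "ophthalmologist"]
responses = ["Amblyopia, often called lazy eye, is treated with patching."]
print(f"{keyword_coverage(keywords, responses):.0%}")  # 67% (2 of 3 keywords)

Applied to the study's counts, 11 of 26 keywords yields the 42% coverage reported for ChatGPT, and 8 of 26 yields the 31% reported for Google Assistant.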
ChatGPT is trained on text and generates new sentences, whereas Google Assistant surfaces links to existing websites. As ophthalmologists, we should consider including "see an ophthalmologist" in our websites and journals. While ChatGPT is here to stay, we, as physicians, need to monitor its answers.