Sensoy Eyupcan, Citirik Mehmet
Ankara Etlik City Hospital, Ankara, Turkey.
Int Ophthalmol. 2023 Dec;43(12):4905-4909. doi: 10.1007/s10792-023-02893-x. Epub 2023 Oct 26.
This study aimed to assess the knowledge levels of the ChatGPT, Bing, and Bard artificial intelligence chatbots, released by three different manufacturers, regarding ophthalmic pathologies and intraocular tumors, to evaluate their usability, and to determine whether any chatbot was superior to the others.
Thirty-six questions were obtained from the study-question section of the American Academy of Ophthalmology 2022-2023 Basic and Clinical Science Course, Ophthalmic Pathology and Intraocular Tumors. Each question was posed separately to the ChatGPT, Bing, and Bard artificial intelligence programs. Answers were categorized as correct or incorrect, and the statistical relationship between the correct and incorrect response rates of the three programs was determined.
Among the artificial intelligence chatbots, ChatGPT answered 58.6% of the questions correctly, Bing 63.9%, and Bard 69.4%. No statistically significant difference was found among the correct-answer rates of the three artificial intelligence programs (p = 0.705, Pearson chi-square test).
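The comparison above can be sketched as a Pearson chi-square test on a 3x2 contingency table of correct/incorrect answers. The counts below are approximate reconstructions from the reported percentages (out of 36 questions, e.g. 23/36 ≈ 63.9% for Bing), not the authors' raw data, so the resulting p-value will not exactly match the published 0.705.

```python
import math

# rows: ChatGPT, Bing, Bard; columns: correct, incorrect
# (assumed counts, inferred from the reported percentages)
observed = [[21, 15], [23, 13], [25, 11]]

row_totals = [sum(row) for row in observed]
col_totals = [sum(col) for col in zip(*observed)]
grand_total = sum(row_totals)

# Pearson chi-square statistic: sum over cells of (O - E)^2 / E,
# where E is the expected count under independence
chi2 = 0.0
for i, row in enumerate(observed):
    for j, obs in enumerate(row):
        expected = row_totals[i] * col_totals[j] / grand_total
        chi2 += (obs - expected) ** 2 / expected

# with df = (3 - 1) * (2 - 1) = 2, the chi-square survival
# function reduces to exp(-x / 2), so no stats library is needed
p_value = math.exp(-chi2 / 2)
print(f"chi2 = {chi2:.3f}, p = {p_value:.3f}")  # p >> 0.05: no significant difference
```

With these reconstructed counts the test likewise fails to reach significance, consistent with the study's conclusion that no chatbot was statistically superior.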
Artificial intelligence chatbots can be used to access information on ophthalmic pathologies and intraocular tumors. However, because not all questions are answered correctly, their responses should be evaluated with caution.