Suppr
超能文献

Evaluating the Efficacy of Artificial Intelligence-Driven Chatbots in Addressing Queries on Vernal Conjunctivitis.

Authors

Saad Muhammad, Moqeet Muhammad A, Mansoor Hassan, Khan Shama, Sharif Rabia, Khan Fahim Ullah, Naqvi Ali H, Ali Warda

Affiliations

Ophthalmology, Al-Shifa Trust Eye Hospital, Rawalpindi, PAK.

Cornea and Refractive Surgery, Al-Shifa Trust Eye Hospital, Rawalpindi, PAK.

Publication information

Cureus. 2025 Feb 26;17(2):e79688. doi: 10.7759/cureus.79688. eCollection 2025 Feb.

DOI:10.7759/cureus.79688

PMID:40161163

Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC11951947/

Abstract

Background

Vernal keratoconjunctivitis (VKC) is a recurrent allergic eye disease that requires accurate patient education to ensure proper management. AI-driven chatbots, such as Google Gemini Advanced (Mountain View, California, US), are increasingly being explored as potential tools for providing medical information. This study evaluates the accuracy, reliability, and clinical applicability of Google Gemini Advanced in addressing VKC-related queries.

Objective

To assess the performance of Google Gemini Advanced in delivering medically accurate and relevant information about VKC and to evaluate its reliability based on expert ratings.

Methods

A total of 125 responses generated by Google Gemini Advanced for 25 VKC-related questions were assessed by two independent cornea specialists. Responses were rated on accuracy, completeness, and potential harm using a 5-point Likert scale (1-5). Inter-rater reliability was measured using Cronbach's alpha. Responses were categorized into highly accurate (score of 5), minor inconsistencies (score of 4), and inaccurate (scores 1-3).

Results

Google Gemini Advanced demonstrated high inter-rater reliability (Cronbach's alpha = 0.92, 95% CI: 0.87-0.94). Of the 125 responses, 108 (86.4%) were rated highly accurate (score of 5) while 17 (13.6%) had minor inconsistencies (score of 4) but posed no potential for harm. No responses were classified as inaccurate or potentially harmful. The combined mean score was 4.88 ± 0.31, reflecting strong agreement between raters. The chatbot consistently provided reliable information across diagnostic, treatment, and prognosis-related queries, with minor gaps in complex grading and treatment-related discussions.

Discussion

The findings support the use of AI-driven chatbots like Google Gemini Advanced as potential tools for patient education in ophthalmology. The chatbot exhibited strong accuracy and consistency, particularly in addressing general VKC-related queries. However, areas for improvement remain, especially in providing detailed guidance on treatment protocols and ensuring completeness in responses to complex clinical questions.

Conclusion

Google Gemini Advanced demonstrates high reliability and accuracy in delivering medical information about VKC, making it a valuable tool for patient education. While its responses are consistent and generally accurate, expert oversight remains necessary to refine AI-generated content for clinical applications. Further research is needed to enhance AI-driven chatbots' ability to provide nuanced medical advice and integrate them safely into ophthalmic patient education and clinical decision-making.
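The scoring scheme described in the abstract (a 1-5 Likert rating per response, three accuracy categories, and Cronbach's alpha across two raters) can be illustrated with a minimal sketch. The abstract does not state which software or exact computation the authors used, so the function names (`cronbach_alpha`, `categorize`, `share`) and the five-item rating lists below are hypothetical and are not study data; only the counts 108, 17, and 125 come from the abstract.

```python
from statistics import variance


def cronbach_alpha(ratings_by_rater):
    """Cronbach's alpha, treating each rater as one 'item' of the scale.

    ratings_by_rater: list of k lists, each holding one rater's scores
    for the same n responses (sample variances are used throughout).
    """
    k = len(ratings_by_rater)
    n = len(ratings_by_rater[0])
    if k < 2 or n < 2 or any(len(r) != n for r in ratings_by_rater):
        raise ValueError("need at least 2 raters with equal-length ratings")
    sum_rater_vars = sum(variance(r) for r in ratings_by_rater)
    totals = [sum(scores) for scores in zip(*ratings_by_rater)]
    return k / (k - 1) * (1 - sum_rater_vars / variance(totals))


def categorize(score):
    """Map a 1-5 Likert score to the three categories named in the abstract."""
    if score == 5:
        return "highly accurate"
    if score == 4:
        return "minor inconsistencies"
    return "inaccurate"


def share(count, total):
    """Percentage of responses in a category, rounded to one decimal place."""
    return round(100 * count / total, 1)


# Hypothetical ratings for five responses (NOT data from the study).
rater_a = [5, 5, 4, 5, 3]
rater_b = [5, 4, 4, 5, 3]
alpha = cronbach_alpha([rater_a, rater_b])

# Counts reported in the abstract: 108 and 17 out of 125 responses.
pct_highly_accurate = share(108, 125)
pct_minor = share(17, 125)
```

Treating raters as scale items is one common way to apply Cronbach's alpha to agreement data; other statistics (for example, an intraclass correlation or weighted kappa) would give different values for the same ratings.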
