Evaluation of the reliability and readability of answers given by chatbots to frequently asked questions about endophthalmitis: A cross-sectional study on chatbots.

Affiliation

Department of Ophthalmology, Adana 5 Ocak State Hospital, Adana, Turkey.

Publication information

Health Informatics J. 2024 Oct-Dec;30(4):14604582241304679. doi: 10.1177/14604582241304679.

Abstract

This study aimed to investigate the accuracy, reliability, and readability of the A-Eye Consult, ChatGPT-4.0, Google Gemini, and Copilot AI large language models (LLMs) in responding to patient questions about endophthalmitis. Two ophthalmologists evaluated the LLMs' responses to 25 questions about endophthalmitis that patients frequently ask, using a five-point Likert scale with scores ranging from 1 to 5. The DISCERN scale assessed the reliability of the LLMs' responses, whereas the Flesch Reading Ease (FRE) and Flesch-Kincaid Grade Level (FKGL) indices assessed readability and text complexity, respectively. A-Eye Consult and ChatGPT-4.0 outperformed Google Gemini and Copilot in providing comprehensive and precise responses. The Likert scores differed significantly across the four LLMs (p < .001), with A-Eye Consult scoring significantly higher than Google Gemini and Copilot (p < .001). The responses from A-Eye Consult and ChatGPT-4.0, although more complex than those of the other LLMs, provided more reliable and accurate information.
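
For readers unfamiliar with the two readability indices named in the abstract, the following is a minimal Python sketch of the standard published FRE and FKGL formulas. It is not the authors' tooling: the abstract does not say which software was used. The function names, the regex-based sentence and word splitting, and the vowel-group syllable counter are all assumptions made for illustration, and the syllable heuristic in particular is cruder than what dedicated readability tools use.

```python
import re


def flesch_reading_ease(words: int, sentences: int, syllables: int) -> float:
    """Standard FRE formula: higher scores mean easier text."""
    return 206.835 - 1.015 * (words / sentences) - 84.6 * (syllables / words)


def flesch_kincaid_grade(words: int, sentences: int, syllables: int) -> float:
    """Standard FKGL formula: approximate US school grade level."""
    return 0.39 * (words / sentences) + 11.8 * (syllables / words) - 15.59


def count_syllables(word: str) -> int:
    """Hypothetical heuristic: one syllable per run of vowels, minimum one."""
    groups = re.findall(r"[aeiouy]+", word.lower())
    return max(1, len(groups))


def text_counts(text: str) -> tuple[int, int, int]:
    """Return (words, sentences, syllables) using naive regex splitting."""
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    words = re.findall(r"[A-Za-z']+", text)
    syllables = sum(count_syllables(w) for w in words)
    return len(words), len(sentences), syllables


if __name__ == "__main__":
    sample = "Endophthalmitis is a serious eye infection. It needs urgent care."
    w, s, syl = text_counts(sample)
    print(f"FRE:  {flesch_reading_ease(w, s, syl):.1f}")
    print(f"FKGL: {flesch_kincaid_grade(w, s, syl):.1f}")
```

Both indices depend on the same two ratios, words per sentence and syllables per word, so a response with longer sentences and more polysyllabic medical terms scores lower on FRE and higher on FKGL. That is consistent with the abstract's observation that the more accurate responses were also the more complex ones.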
