Riestra-Ayora Juan, Vaduva Cristina, Esteban-Sánchez Jonathan, Garrote-Garrote María, Fernández-Navarro Carlos, Sánchez-Rodríguez Carolina, Martin-Sanz Eduardo
Department of Medicine, Faculty of Biomedical and Health Sciences, Universidad Europea de Madrid, Villaviciosa de Odón, 28670, Madrid, Spain.
Department of Otolaryngology-Head and Neck Surgery, Hospital Universitario de Getafe, Carretera de Toledo, Km 12.500, Getafe, 28905, Madrid, Spain.
Eur Arch Otorhinolaryngol. 2024 Jun;281(6):3253-3259. doi: 10.1007/s00405-024-08581-5. Epub 2024 Mar 4.
ChatGPT (Chat Generative Pre-trained Transformer) has proven to be a powerful information tool on various topics, including healthcare. The system is trained on information drawn from the Internet, which is not always reliable. Currently, few studies analyze the validity of its responses in rhinology. Our work aims to assess the quality and reliability of the information provided by AI regarding the main rhinological pathologies.
We asked the default ChatGPT version (GPT-3.5) 65 questions about the most prevalent pathologies in rhinology, focusing on causes, risk factors, treatments, prognosis, and outcomes. We used the DISCERN questionnaire and a hexagonal radar chart to evaluate the quality of the information, and Fleiss's kappa statistic to determine the consistency of agreement between different observers.
The overall evaluation of the DISCERN questionnaire yielded a score of 4.05 (± 0.6). The results in the Reliability section were worse, with an average score of 3.18 (± 1.77); this score was lowered by the responses to questions about the source of the information provided. The average score for the Quality section was 3.59 (± 1.18). Fleiss's kappa showed substantial agreement, with a K of 0.69 (p < 0.001).
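The inter-observer consistency reported above relies on Fleiss's kappa, which compares observed agreement among several raters against the agreement expected by chance (values of 0.61-0.80 are conventionally read as "substantial"). A minimal sketch of the computation is shown below; the function name and input layout are illustrative assumptions, not the authors' actual analysis code.

```python
def fleiss_kappa(counts):
    """Fleiss's kappa for multiple raters and categorical ratings.

    counts[i][j] = number of raters assigning item i to category j.
    Every row must sum to the same number of raters n.
    """
    N = len(counts)         # number of items rated
    n = sum(counts[0])      # raters per item
    k = len(counts[0])      # number of categories

    # Mean observed agreement: for each item, the proportion of
    # rater pairs that agree, averaged over all items.
    P_bar = sum(
        (sum(c * c for c in row) - n) / (n * (n - 1)) for row in counts
    ) / N

    # Chance agreement from the marginal category proportions.
    p = [sum(row[j] for row in counts) / (N * n) for j in range(k)]
    P_e = sum(pj * pj for pj in p)

    return (P_bar - P_e) / (1 - P_e)
```

For example, three raters who unanimously place one item in the first category and another item in the second give `fleiss_kappa([[3, 0], [0, 3]])` = 1.0, i.e. perfect agreement beyond chance.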
ChatGPT's answers are accurate and reliable, generating simple, understandable descriptions of each pathology for the patient's benefit. Our team considers that ChatGPT could be a useful tool for providing information, under prior supervision by a health professional.