Arora Shiv, Ramesh Meghna, Moe Aye Thandar, Giri Tapan, Parrikh Kaksha, Challa Hima Varsha
General Surgery, Sardar Patel Medical College, Bikaner, IND.
Internal Medicine, Asian Institute of Gastroenterology Hospitals, Hyderabad, IND.
Cureus. 2024 Nov 7;16(11):e73212. doi: 10.7759/cureus.73212. eCollection 2024 Nov.
Introduction: Epilepsy is a chronic disorder whose management requires patient education to avoid triggers and complications. This study aimed to evaluate and compare the effectiveness of two artificial intelligence (AI) tools, ChatGPT (version 3.5, OpenAI, Inc., San Francisco, United States) and Google Gemini (version 1.5, Google LLC, Mountain View, California, United States), in generating patient education guides for epilepsy disorders.

Methodology: A patient education guide was generated with ChatGPT and with Google Gemini. The study analyzed the sentence count, readability, and ease of understanding using the Flesch-Kincaid calculator, examined similarity using the QuillBot plagiarism tool, and assessed reliability using a modified DISCERN score. Statistical analysis used an unpaired t-test, with a p-value <0.05 considered significant.

Results: There was no statistically significant difference between ChatGPT and Google Gemini in word count (p=0.75), sentence count (p=0.96), average words per sentence (p=0.66), grade level (p=0.67), similarity percentage (p=0.57), or reliability score (p=0.42). Ease scores generated by ChatGPT and Google Gemini were 38.6 and 43.6 for generalized tonic-clonic seizures (GTCS), 18.7 and 45.5 for myoclonic seizures, and 22.4 and 55.8 for status epilepticus, respectively, indicating that Google Gemini's responses were notably easier to read (p=0.0493). The average syllables per word (p=0.035) were also appreciably lower for Google Gemini-generated responses (1.8 for GTCS, 1.8 for myoclonic seizures, and 1.7 for status epilepticus) than for ChatGPT responses (1.9 for GTCS, 2.0 for myoclonic seizures, and 2.1 for status epilepticus).

Conclusions: A significant difference was seen in only two parameters (reading-ease score and average syllables per word). Further improvement in AI tools is necessary to provide effective patient education guides.
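To illustrate the metrics named in the Methodology, the sketch below shows the standard Flesch Reading Ease and Flesch-Kincaid Grade Level formulas and an unpaired (independent two-sample) t-test applied to the ease scores quoted in the abstract. This is a minimal illustration only: the authors used an online Flesch-Kincaid calculator and do not publish a script, the input values are limited to the three scores per tool reported here, and the resulting p-value may differ slightly from the reported 0.0493 due to rounding or analysis details not given in the abstract.

```python
from scipy import stats

def flesch_reading_ease(words: int, sentences: int, syllables: int) -> float:
    """Standard Flesch Reading Ease formula (higher = easier to read)."""
    return 206.835 - 1.015 * (words / sentences) - 84.6 * (syllables / words)

def flesch_kincaid_grade(words: int, sentences: int, syllables: int) -> float:
    """Standard Flesch-Kincaid Grade Level formula (U.S. school grade)."""
    return 0.39 * (words / sentences) + 11.8 * (syllables / words) - 15.59

# Ease scores quoted in the abstract (GTCS, myoclonic seizures, status epilepticus).
chatgpt_ease = [38.6, 18.7, 22.4]
gemini_ease = [43.6, 45.5, 55.8]

# Unpaired t-test comparing the two tools, as described in the Methodology.
t_stat, p_value = stats.ttest_ind(chatgpt_ease, gemini_ease)
print(f"t = {t_stat:.2f}, p = {p_value:.4f}")  # significant if p < 0.05
```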