Mondal Himel, De Rajesh, Mondal Shaikat, Juhi Ayesha
Department of Physiology, All India Institute of Medical Sciences, Deoghar, Jharkhand, India.
Department of Community Medicine, Malda Medical College and Hospital, Malda, West Bengal, India.
J Educ Health Promot. 2024 Sep 28;13:362. doi: 10.4103/jehp.jehp_688_23. eCollection 2024.
Access to quality health care is essential, particularly in remote areas where the availability of healthcare professionals may be limited. Advances in artificial intelligence (AI) and natural language processing (NLP) have led to the development of large language models (LLMs) that exhibit capabilities in understanding and generating human-like text. This study aimed to evaluate the performance of an LLM, ChatGPT, in addressing primary healthcare issues.
This study was conducted in May 2023 with the ChatGPT May 12 version. A total of 30 multiple-choice questions (MCQs) related to primary health care were selected to test the proficiency of ChatGPT. These MCQs covered various topics commonly encountered in primary healthcare practice. ChatGPT answered each question in two segments: first, choosing the single best answer to the MCQ, and second, providing supporting text for the answer. The answers to the MCQs were compared with predefined answer keys. The justifications for the answers were rated by two primary healthcare professionals on a 5-point Likert-type scale. The data were presented as numbers and percentages.
Among the 30 questions, ChatGPT provided correct responses for 28, yielding an accuracy of 93.33%. The mean score for the explanations supporting the answers was 4.58 ± 0.85. There was an inter-item correlation of 0.896, and the average-measures intraclass correlation coefficient (ICC) was 0.94 (95% confidence interval 0.88-0.97), indicating a high level of interobserver agreement.
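The headline accuracy follows directly from the reported counts, and the average-measures ICC can be reproduced from a subjects-by-raters score table. The sketch below is illustrative only: the per-question Likert scores are not given in the abstract, so the example table is a synthetic placeholder, and the consistency form ICC(3,k) is assumed as one plausible choice of ICC model.

```python
# Sketch: accuracy from counts, and an average-measures consistency
# ICC (ICC(3,k)) for a subjects x raters score table. The per-item
# rater data are NOT reported in the abstract; any table passed in
# here is a synthetic placeholder.

def accuracy(correct, total):
    """Percentage of correctly answered MCQs."""
    return 100.0 * correct / total

def icc3k(ratings):
    """Average-measures consistency ICC:
    ICC(3,k) = (MS_rows - MS_error) / MS_rows,
    from a two-way ANOVA over a subjects x raters table."""
    n = len(ratings)           # subjects (questions)
    k = len(ratings[0])        # raters
    grand = sum(sum(row) for row in ratings) / (n * k)
    row_means = [sum(row) / k for row in ratings]
    col_means = [sum(row[j] for row in ratings) / n for j in range(k)]
    ss_rows = k * sum((m - grand) ** 2 for m in row_means)
    ss_cols = n * sum((m - grand) ** 2 for m in col_means)
    ss_total = sum((x - grand) ** 2 for row in ratings for x in row)
    ss_err = ss_total - ss_rows - ss_cols
    ms_rows = ss_rows / (n - 1)
    ms_err = ss_err / ((n - 1) * (k - 1))
    return (ms_rows - ms_err) / ms_rows

print(f"Accuracy: {accuracy(28, 30):.2f}%")  # 93.33%
# Perfect two-rater agreement on a synthetic table gives ICC = 1.0:
print(icc3k([[5, 5], [4, 4], [3, 3]]))
```

With the study's actual two-rater Likert scores in place of the placeholder table, `icc3k` would yield the reported coefficient (0.94 under the authors' chosen ICC model).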
LLMs, such as ChatGPT, show promising potential in addressing primary healthcare issues. The high accuracy achieved by ChatGPT in answering primary healthcare-related MCQs underscores the value of these models as resources for patients and healthcare providers in remote healthcare settings. These models can also support self-directed learning by medical students.