Department of Surgery, Peninsula Health, Melbourne, Victoria, Australia.
Faculty of Science, Medicine, and Health, Monash University, Melbourne, Victoria, Australia.
ANZ J Surg. 2024 Feb;94(1-2):68-77. doi: 10.1111/ans.18666. Epub 2023 Aug 21.
The COVID-19 pandemic has significantly disrupted the clinical experience and exposure of medical students and junior doctors. Integrating artificial intelligence (AI) into medical education has the potential to enhance learning and improve patient care. This study aimed to evaluate the effectiveness of three popular large language models (LLMs) as clinical decision-making support tools for junior doctors.
A series of increasingly complex clinical scenarios was presented to ChatGPT, Google's Bard, and Bing's AI. Their responses were evaluated against standard guidelines: readability was measured with the Flesch Reading Ease Score, the Flesch-Kincaid Grade Level, and the Coleman-Liau Index, and reliability and suitability with the modified DISCERN score. Finally, three experienced specialists rated the LLMs' outputs for accuracy, informativeness, and accessibility on a Likert scale.
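The three readability indices named above are standard published formulas computed from word, sentence, letter, and syllable counts. The sketch below is a minimal illustration in Python using a naive vowel-group syllable heuristic; it is not the tool used in the study, and real readability software uses more careful tokenization and syllable rules.

```python
import re


def text_counts(text):
    """Naive word, sentence, letter, and syllable counts (simplified heuristics)."""
    words = re.findall(r"[A-Za-z]+", text)
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    letters = sum(len(w) for w in words)

    def syllables(word):
        # Approximate syllables as runs of vowels; at least one per word.
        return max(1, len(re.findall(r"[aeiouy]+", word.lower())))

    return len(words), len(sentences), letters, sum(syllables(w) for w in words)


def readability(text):
    """Return (Flesch Reading Ease, Flesch-Kincaid Grade Level, Coleman-Liau Index)."""
    w, s, letters, syl = text_counts(text)
    fre = 206.835 - 1.015 * (w / s) - 84.6 * (syl / w)
    fkgl = 0.39 * (w / s) + 11.8 * (syl / w) - 15.59
    # Coleman-Liau uses letters (L) and sentences (S) per 100 words.
    cli = 0.0588 * (letters / w * 100) - 0.296 * (s / w * 100) - 15.8
    return fre, fkgl, cli
```

Higher Flesch Reading Ease means easier text, while the other two indices approximate the US school grade level needed to understand it; the abstract's scores (FRE ≈ 31, FKGL ≈ 13.5) correspond to college-level prose.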
In terms of readability and reliability, ChatGPT stood out among the three LLMs, recording the highest scores on the Flesch Reading Ease (31.2 ± 3.5), Flesch-Kincaid Grade Level (13.5 ± 0.7), Coleman-Liau Index (13), and DISCERN (62 ± 4.4). These results suggest that ChatGPT's medical advice was significantly more comprehensible and better aligned with clinical guidelines. Bard followed closely, with BingAI trailing in all categories. The only statistically non-significant differences (P > 0.05) were between the readability indices of ChatGPT and Bard, and between the Flesch Reading Ease scores of ChatGPT/Bard and BingAI.
This study demonstrates the potential utility of LLMs in fostering self-directed and personalized learning, as well as in bolstering clinical decision-making support for junior doctors. However, further development is needed before these models can be integrated into medical education.