Cost, Usability, Credibility, Fairness, Accountability, Transparency, and Explainability Framework for Safe and Effective Large Language Models in Medical Education: Narrative Review and Qualitative Study.

Authors

Quttainah Majdi, Mishra Vinaytosh, Madakam Somayya, Lurie Yotam, Mark Shlomo

Affiliations

College of Business Administration, Kuwait University, Kuwait, Kuwait.

College of Healthcare Management and Economics, Gulf Medical University, Ajman, United Arab Emirates.

Publication

JMIR AI. 2024 Apr 23;3:e51834. doi: 10.2196/51834.


DOI: 10.2196/51834
PMID: 38875562
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC11077408/
Abstract

BACKGROUND: The world has witnessed increased adoption of large language models (LLMs) in the last year. Although the products developed using LLMs have the potential to solve accessibility and efficiency problems in health care, there is a lack of available guidelines for developing LLMs for health care, especially for medical education.

OBJECTIVE: The aim of this study was to identify and prioritize the enablers for developing successful LLMs for medical education. We further evaluated the relationships among these identified enablers.

METHODS: A narrative review of the extant literature was first performed to identify the key enablers for LLM development. We additionally gathered the opinions of LLM users to determine the relative importance of these enablers using an analytical hierarchy process (AHP), which is a multicriteria decision-making method. Further, total interpretive structural modeling (TISM) was used to analyze the perspectives of product developers and ascertain the relationships and hierarchy among these enablers. Finally, the cross-impact matrix-based multiplication applied to a classification (MICMAC) approach was used to determine the relative driving and dependence powers of these enablers. A nonprobabilistic purposive sampling approach was used for recruitment of focus groups.

RESULTS: The AHP demonstrated that the most important enabler for LLMs was credibility, with a priority weight of 0.37, followed by accountability (0.27642) and fairness (0.10572). In contrast, usability, with a priority weight of 0.04, showed negligible importance. The results of TISM concurred with the findings of the AHP. The only striking difference between expert perspectives and user preference evaluation was that the product developers indicated that cost has the least importance as a potential enabler. The MICMAC analysis suggested that cost has a strong influence on other enablers. The inputs of the focus group were found to be reliable, with a consistency ratio less than 0.1 (0.084).

CONCLUSIONS: This study is the first to identify, prioritize, and analyze the relationships of enablers of effective LLMs for medical education. Based on the results of this study, we developed a comprehensible prescriptive framework, named CUC-FATE (Cost, Usability, Credibility, Fairness, Accountability, Transparency, and Explainability), for evaluating the enablers of LLMs in medical education. The study findings are useful for health care professionals, health technology experts, medical technology regulators, and policy makers.
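The AHP step the abstract describes derives priority weights from a pairwise comparison matrix and checks judgment reliability via the consistency ratio (CR), accepting the inputs when CR < 0.1 (here, 0.084). The sketch below illustrates that computation on a small invented 3×3 matrix; the matrix values, and the restriction to three criteria, are assumptions for illustration and not the authors' data.

```python
import numpy as np

# Illustrative pairwise comparison matrix (NOT the study's data):
# entry A[i, j] says how much more important criterion i is than j
# on Saaty's 1-9 scale; A[j, i] is the reciprocal.
A = np.array([
    [1.0, 5.0, 3.0],
    [1/5, 1.0, 1/2],
    [1/3, 2.0, 1.0],
])
n = A.shape[0]

# Priority weights: principal eigenvector of A, normalized to sum to 1.
eigvals, eigvecs = np.linalg.eig(A)
k = np.argmax(eigvals.real)
w = np.abs(eigvecs[:, k].real)
w /= w.sum()

# Consistency: CI = (lambda_max - n) / (n - 1), CR = CI / RI,
# where RI is Saaty's random consistency index (0.58 for n = 3).
lambda_max = eigvals[k].real
CI = (lambda_max - n) / (n - 1)
RI = {3: 0.58, 4: 0.90, 5: 1.12, 6: 1.24, 7: 1.32}[n]
CR = CI / RI

print("priority weights:", np.round(w, 3))
print("consistency ratio:", round(CR, 3))  # judgments acceptable when CR < 0.1
```

With this near-consistent example matrix the first criterion dominates (weight ≈ 0.65) and CR comes out well under the 0.1 threshold, mirroring the acceptance rule the study applies to its focus-group inputs.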

Figures
Figure 1: https://cdn.ncbi.nlm.nih.gov/pmc/blobs/550f/11077408/2e886be35d0d/ai_v3i1e51834_fig1.jpg
Figure 2: https://cdn.ncbi.nlm.nih.gov/pmc/blobs/550f/11077408/929851e5bf17/ai_v3i1e51834_fig2.jpg
Figure 3: https://cdn.ncbi.nlm.nih.gov/pmc/blobs/550f/11077408/ba3f270b8390/ai_v3i1e51834_fig3.jpg

Similar Articles

[1]
Cost, Usability, Credibility, Fairness, Accountability, Transparency, and Explainability Framework for Safe and Effective Large Language Models in Medical Education: Narrative Review and Qualitative Study.

JMIR AI. 2024-4-23

[2]
The Role of Large Language Models in Transforming Emergency Medicine: Scoping Review.

JMIR Med Inform. 2024-5-10

[3]
Large Language Models and User Trust: Consequence of Self-Referential Learning Loop and the Deskilling of Health Care Professionals.

J Med Internet Res. 2024-4-25

[4]
Potential of Large Language Models in Health Care: Delphi Study.

J Med Internet Res. 2024-5-13

[5]
Assessing the Alignment of Large Language Models With Human Values for Mental Health Integration: Cross-Sectional Study Using Schwartz's Theory of Basic Values.

JMIR Ment Health. 2024-4-9

[6]
Evaluation of the Performance of Generative AI Large Language Models ChatGPT, Google Bard, and Microsoft Bing Chat in Supporting Evidence-Based Dentistry: Comparative Mixed Methods Study.

J Med Internet Res. 2023-12-28

[7]
Quality of Answers of Generative Large Language Models Versus Peer Users for Interpreting Laboratory Test Results for Lay Patients: Evaluation Study.

J Med Internet Res. 2024-4-17

[8]
Learning to Make Rare and Complex Diagnoses With Generative AI Assistance: Qualitative Study of Popular Large Language Models.

JMIR Med Educ. 2024-2-13

[9]
Quality of Answers of Generative Large Language Models vs Peer Patients for Interpreting Lab Test Results for Lay Patients: Evaluation Study.

ArXiv. 2024-1-23

[10]
A Systematic Review of ChatGPT and Other Conversational Large Language Models in Healthcare.

medRxiv. 2024-4-27

Cited By

[1]
Evaluating the role of large language models in traditional Chinese medicine diagnosis and treatment recommendations.

NPJ Digit Med. 2025-7-21

[2]
Utilizing AI-Powered Thematic Analysis: Methodology, Implementation, and Lessons Learned.

Cureus. 2025-6-4

[3]
AI in Home Care-Evaluation of Large Language Models for Future Training of Informal Caregivers: Observational Comparative Case Study.

J Med Internet Res. 2025-4-28

[4]
How to incorporate generative artificial intelligence in nephrology fellowship education.

J Nephrol. 2024-12

References

[1]
A vignette-based evaluation of ChatGPT's ability to provide appropriate and equitable medical advice across care contexts.

Sci Rep. 2023-10-19

[2]
Response to: "The next paradigm shift? ChatGPT, artificial intelligence, and medical education".

Med Teach. 2024-1

[3]
Integrating ChatGPT in Medical Education: Adapting Curricula to Cultivate Competent Physicians for the AI Era.

Cureus. 2023-8-6

[4]
Assessing the Utility of ChatGPT Throughout the Entire Clinical Workflow: Development and Usability Study.

J Med Internet Res. 2023-8-22

[5]
Utility of ChatGPT in Clinical Practice.

J Med Internet Res. 2023-6-28

[6]
Artificial Intelligence Can Generate Fraudulent but Authentic-Looking Scientific Medical Articles: Pandora's Box Has Been Opened.

J Med Internet Res. 2023-5-31

[7]
ChatGPT in medical practice, education and research: malpractice and plagiarism.

Clin Med (Lond). 2023-5

[8]
ChatGPT in Medical Education: a Paradigm Shift or a Dangerous Tool?

Acad Psychiatry. 2023-8

[9]
Revolutionizing Medical Education: Can ChatGPT Boost Subjective Learning and Expression?

J Med Syst. 2023-5-9

[10]
ChatGPT Utility in Healthcare Education, Research, and Practice: Systematic Review on the Promising Perspectives and Valid Concerns.

Healthcare (Basel). 2023-3-19
