Evaluating Quality and Readability of AI-generated Information on Living Kidney Donation.

Author Information

Villani Vincenzo, Nguyen Hong-Hanh T, Shanmugarajah Kumaran

Affiliations

Division of Immunology and Organ Transplantation, McGovern Medical School at UTHealth Houston, Houston, TX.

Liver Specialists of Texas, Houston, TX.

Publication Information

Transplant Direct. 2024 Dec 10;11(1):e1740. doi: 10.1097/TXD.0000000000001740. eCollection 2025 Jan.

Abstract

BACKGROUND

The availability of high-quality and easy-to-read informative material is crucial to providing accurate information to prospective kidney donors. The quality of this information has been associated with the likelihood of proceeding with a living donation. Artificial intelligence-based large language models (LLMs) have recently become common instruments for acquiring information online, including medical information. The aim of this study was to assess the quality and readability of artificial intelligence-generated information on kidney donation.

METHODS

A set of 35 common donor questions was developed by the authors and used to interrogate 3 LLMs (ChatGPT, Google Gemini, and MedGPT). Answers were collected and independently evaluated using the CLEAR tool for (1) completeness, (2) lack of false information, (3) evidence-based information, (4) appropriateness, and (5) relevance. Readability was evaluated using the Flesch-Kincaid Reading Ease Score and the Flesch-Kincaid Grade Level.
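
For illustration, the sketch below computes the two readability metrics named in the Methods for a sample answer, using the published Flesch Reading Ease and Flesch-Kincaid Grade Level formulas. The syllable counter is a simple heuristic, and the sample text is hypothetical, not taken from the study's question set or model outputs; the study's exact scoring tooling is not specified in the abstract.

```python
import re

def count_syllables(word: str) -> int:
    """Rough heuristic: count vowel groups, dropping a silent trailing 'e'."""
    word = word.lower()
    count = len(re.findall(r"[aeiouy]+", word))
    if word.endswith("e") and count > 1:
        count -= 1
    return max(count, 1)

def flesch_scores(text: str) -> tuple[float, float]:
    """Return (Flesch Reading Ease, Flesch-Kincaid Grade Level) for a text."""
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    words = re.findall(r"[A-Za-z']+", text)
    syllables = sum(count_syllables(w) for w in words)
    words_per_sentence = len(words) / len(sentences)
    syllables_per_word = syllables / len(words)
    reading_ease = 206.835 - 1.015 * words_per_sentence - 84.6 * syllables_per_word
    grade_level = 0.39 * words_per_sentence + 11.8 * syllables_per_word - 15.59
    return reading_ease, grade_level

# Hypothetical model answer to a donor question (not from the study)
answer = ("Living kidney donation is generally safe for carefully screened donors, "
          "but it involves surgery, a recovery period, and lifelong medical follow-up.")
ease, grade = flesch_scores(answer)
print(f"Reading Ease: {ease:.1f}  Grade Level: {grade:.1f}")
```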

RESULTS

The interrater intraclass correlation was 0.784 (95% confidence interval, 0.716-0.814). Median CLEAR scores were ChatGPT 22 (interquartile range [IQR], 3.67), Google Gemini 24.33 (IQR, 2.33), and MedGPT 23.33 (IQR, 2.00). ChatGPT, Gemini, and MedGPT had mean Flesch-Kincaid Reading Ease Scores of 37.32 (SD = 10.00), 39.42 (SD = 13.49), and 29.66 (SD = 7.94), respectively. On the Flesch-Kincaid Grade Level assessment, ChatGPT had an average score of 12.29, Gemini 10.63, and MedGPT 13.21 (P < 0.001), indicating that all LLMs produced text at a college reading level.
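
For context, the sketch below maps the reported mean Reading Ease scores onto the conventional Flesch interpretation bands, in which scores of 30-50 correspond to college-level text. The band boundaries are the standard published ones and are not taken from this paper.

```python
# Conventional Flesch Reading Ease bands (standard interpretation, not from the paper)
FLESCH_BANDS = [
    (90, "very easy (about 5th grade)"),
    (80, "easy (6th grade)"),
    (70, "fairly easy (7th grade)"),
    (60, "standard (8th-9th grade)"),
    (50, "fairly difficult (10th-12th grade)"),
    (30, "difficult (college)"),
    (0,  "very difficult (college graduate)"),
]

def interpret_reading_ease(score: float) -> str:
    """Map a Flesch Reading Ease score to its conventional difficulty band."""
    for threshold, label in FLESCH_BANDS:
        if score >= threshold:
            return label
    return FLESCH_BANDS[-1][1]  # scores below 0 fall in the hardest band

# Mean Reading Ease scores reported for the three models
for model, score in {"ChatGPT": 37.32, "Google Gemini": 39.42, "MedGPT": 29.66}.items():
    print(f"{model}: {score} -> {interpret_reading_ease(score)}")
```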

CONCLUSIONS

Current LLMs provide fairly accurate responses to common prospective living kidney donor questions; however, the generated information is complex and requires an advanced level of education to understand. As LLMs become more relevant in the field of medical information, transplant providers should familiarize themselves with the shortcomings of these technologies.

