Assessment of ChatGPT-generated medical Arabic responses for patients with metabolic dysfunction-associated steatotic liver disease.

Authors

Alqahtani Saleh A, AlAhmed Reem S, AlOmaim Waleed S, Alghamdi Saad, Al-Hamoudi Waleed, Bzeizi Khalid Ibrahim, Albenmousa Ali, Aghemo Alessio, Pugliese Nicola, Hassan Cesare, Abaalkhail Faisal A

Affiliations

Liver, Digestive, and Lifestyle Health Research Section, and Organ Transplant Center of Excellence, King Faisal Specialist Hospital and Research Center, Riyadh, Saudi Arabia.

Division of Gastroenterology and Hepatology, Weill Cornell Medicine, New York, New York, United States of America.

Publication information

PLoS One. 2025 Feb 3;20(2):e0317929. doi: 10.1371/journal.pone.0317929. eCollection 2025.

Abstract

BACKGROUND AND AIM

Artificial intelligence (AI)-powered chatbots, such as Chat Generative Pretrained Transformer (ChatGPT), have shown promising results in healthcare settings. These tools can help patients obtain real-time responses to queries, ensuring immediate access to relevant information. The study aimed to explore the potential use of ChatGPT-generated medical Arabic responses for patients with metabolic dysfunction-associated steatotic liver disease (MASLD).

METHODS

An English patient questionnaire on MASLD was translated into Arabic. The Arabic questions were then entered into ChatGPT 3.5 on November 12, 2023. Ten Saudi MASLD experts who were native Arabic speakers rated the responses on Likert scales for (1) accuracy, (2) completeness, and (3) comprehensibility. The questions were grouped into 3 domains: (1) specialist referral, (2) lifestyle, and (3) physical activity.

RESULTS

The mean accuracy score was 4.9 ± 0.94 on a 6-point Likert scale, corresponding to "Nearly all correct." Kendall's coefficient of concordance (KCC) ranged from 0.025 to 0.649, with a mean of 0.28, indicating moderate agreement among the 10 experts. The mean completeness score was 2.4 ± 0.53 on a 3-point Likert scale, corresponding to "Comprehensive" (KCC: 0.03-0.553; mean: 0.22). The mean comprehensibility score was 2.74 ± 0.52 on a 3-point Likert scale, indicating that the responses were "Easy to understand" (KCC: 0.00-0.447; mean: 0.25).
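For readers unfamiliar with the agreement statistic, the following is a minimal sketch of how Kendall's coefficient of concordance can be computed from a raters-by-items score matrix. The abstract does not state how the authors computed KCC, so the tie-corrected formula used here is an assumption, the function names `average_ranks` and `kendalls_w` are illustrative, and the ratings in the usage lines are made up for demonstration and are not study data.

```python
from collections import Counter


def average_ranks(scores):
    """Rank scores 1..n, giving tied scores the mean of the positions they occupy."""
    order = sorted(range(len(scores)), key=lambda i: scores[i])
    ranks = [0.0] * len(scores)
    pos = 0
    while pos < len(order):
        end = pos
        # Extend the window over every score tied with the one at `pos`.
        while end + 1 < len(order) and scores[order[end + 1]] == scores[order[pos]]:
            end += 1
        mean_rank = (pos + end) / 2 + 1  # positions are 0-based, ranks are 1-based
        for k in range(pos, end + 1):
            ranks[order[k]] = mean_rank
        pos = end + 1
    return ranks


def kendalls_w(ratings):
    """Tie-corrected Kendall's W; ratings[j][i] is the score rater j gave item i."""
    m = len(ratings)      # number of raters
    n = len(ratings[0])   # number of rated items
    rank_rows = [average_ranks(row) for row in ratings]
    totals = [sum(row[i] for row in rank_rows) for i in range(n)]
    mean_total = m * (n + 1) / 2
    s = sum((t - mean_total) ** 2 for t in totals)
    # Tie correction: sum of (t^3 - t) over each rater's groups of tied scores.
    ties = sum(t ** 3 - t for row in ratings for t in Counter(row).values())
    denom = m ** 2 * (n ** 3 - n) - m * ties
    # W is undefined when every rater gives all items the same score.
    return 12 * s / denom if denom else float("nan")


# Hypothetical Likert ratings (3 raters x 4 responses), not taken from the study.
example = [
    [5, 6, 4, 6],
    [5, 5, 4, 6],
    [6, 5, 3, 6],
]
print(round(kendalls_w(example), 3))
```

A value of 1 means the raters ranked the items identically, and a value of 0 means their rankings show no agreement. Because Likert scores produce many ties, the tie correction matters: without it, identical ratings that contain ties would yield a value below 1.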

CONCLUSION

MASLD experts found that ChatGPT responses were accurate, complete, and comprehensible. The results support the increasing trend of leveraging the power of AI chatbots to revolutionize the dissemination of information for patients with MASLD. However, many AI-powered chatbots require further enhancement of scientific content to avoid the risks of circulating medical misinformation.

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/0db4/11790096/9cc655dd0b28/pone.0317929.g001.jpg
