Assessing the Accuracy of ChatGPT in Answering Questions About Prolonged Disorders of Consciousness.

Author Information

Bagnato Sergio, Boccagni Cristina, Bonavita Jacopo

Affiliation

Villa Rosa Rehabilitation Hospital, Provincial Agency for Health Services (APSS) of Trento, 38057 Pergine Valsugana, Italy.

Publication Information

Brain Sci. 2025 Apr 13;15(4):392. doi: 10.3390/brainsci15040392.

Abstract

Background: Prolonged disorders of consciousness (DoC) present complex diagnostic and therapeutic challenges. This study aimed to evaluate the accuracy of two ChatGPT models (ChatGPT 4o and ChatGPT o1) in answering questions about prolonged DoC, framed as if they were posed by a patient's relative. Secondary objectives included comparing performance across languages (English vs. Italian) and assessing whether responses conveyed an empathetic tone. Methods: Fifty-seven open-ended questions reflecting common caregiver concerns were generated in both English and Italian, each categorized into one of three domains: clinical data, instrumental diagnostics, or therapy. Each question contained a background context followed by a specific query and was submitted once to both models. Two reviewers evaluated the responses on a four-point scale, ranging from "incorrect and potentially misleading" to "correct and complete". Discrepancies were resolved by a third reviewer. Accuracy, language differences, empathy, and recommendations to consult a healthcare professional were analyzed using absolute frequencies, percentages, the Mann-Whitney U test, and Chi-squared tests. Results: A total of 228 responses were analyzed. Both models provided predominantly correct answers (80.7-96.8%), with English responses achieving higher accuracy only for ChatGPT 4o on clinical data. ChatGPT 4o exhibited greater empathy in its responses, whereas ChatGPT o1 more frequently recommended consulting a healthcare professional in Italian. Conclusions: Both ChatGPT models demonstrated high accuracy in addressing prolonged DoC queries, highlighting their potential usefulness for caregiver support. However, occasional inaccuracies emphasize the importance of verifying chatbot-generated information with professional medical advice.
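To illustrate the two test statistics named in the abstract, the following sketch computes a Mann-Whitney U statistic (e.g., for comparing four-point accuracy ratings across languages) and a Pearson Chi-squared statistic (e.g., for comparing empathy frequencies between models) in pure Python. All data values below are invented for illustration and are not the study's data.

```python
def mann_whitney_u(x, y):
    """Mann-Whitney U statistic for sample x vs. sample y (mid-ranks for ties)."""
    combined = sorted(x + y)
    # Assign each distinct value the average of the ranks it occupies
    ranks = {}
    i = 0
    while i < len(combined):
        j = i
        while j < len(combined) and combined[j] == combined[i]:
            j += 1
        ranks[combined[i]] = (i + 1 + j) / 2  # mean of ranks i+1 .. j
        i = j
    r1 = sum(ranks[v] for v in x)  # rank sum of the first sample
    return r1 - len(x) * (len(x) + 1) / 2

def chi_squared(table):
    """Pearson Chi-squared statistic for a contingency table (list of rows)."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    total = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / total
            stat += (observed - expected) ** 2 / expected
    return stat

# Invented four-point accuracy ratings (1 = incorrect/misleading ... 4 = correct/complete)
english_ratings = [4, 4, 3, 4, 2, 4]
italian_ratings = [3, 4, 3, 2, 3, 4]
print("U =", mann_whitney_u(english_ratings, italian_ratings))

# Invented 2x2 table: [empathetic, not empathetic] responses per model
empathy_table = [[45, 12],   # model A
                 [28, 29]]   # model B
print("Chi-squared =", round(chi_squared(empathy_table), 2))
```

In practice one would use a statistics package (e.g., `scipy.stats.mannwhitneyu` and `scipy.stats.chi2_contingency`) to also obtain p-values; this sketch only shows how the raw statistics are formed from the ratings and frequency counts the study describes.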

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/b9e2/12025412/48c6b761d004/brainsci-15-00392-g001.jpg
