Zeljkovic Ivan, Novak Matea, Jordan Ana, Lisicic Ante, Nemeth-Blažić Tatjana, Pavlovic Nikola, Manola Šime
Department of Cardiovascular Diseases, Dubrava University Hospital, Avenija Gojka Šuška, Zagreb, Croatia.
Catholic University of Croatia, Zagreb, Croatia.
Heart Rhythm O2. 2024 Oct 19;6(1):58-63. doi: 10.1016/j.hroo.2024.10.005. eCollection 2025 Jan.
As artificial intelligence and large language models continue to evolve, their application in health care is expanding. OpenAI's Chat Generative Pre-trained Transformer 4 (ChatGPT-4) represents the latest advancement in this technology, capable of engaging in complex dialogues and providing information.
This study explores the correctness of ChatGPT-4 in informing patients about atrial fibrillation.
In this cross-sectional observational study, ChatGPT-4 responded to a structured set of 108 questions across 10 categories related to atrial fibrillation. These categories included basic information, treatment options, lifestyle adjustments, and more, reflecting common patient inquiries. A panel of 3 cardiologists evaluated the model's responses on the basis of accuracy, comprehensiveness, clarity, relevance to clinical practice, and patient safety. The total correctness of ChatGPT-4 was quantitatively assessed through scores assigned in each category, and statistical analysis was performed to identify significant differences in performance across categories.
ChatGPT-4 provided correct and relevant answers, with considerable variability across categories. It excelled in "Lifestyle Adjustments" and "Daily Life and Management," with perfect and near-perfect scores, but scored lower in "Miscellaneous Concerns." Statistical analysis confirmed significant differences in total scores across categories (P = .020).
Our results suggest that while ChatGPT-4 is reliable in categories with structured and direct queries, it shows limitations when handling complex medical queries that require in-depth explanations or clinical judgment. ChatGPT-4 shows promise as a patient information tool for atrial fibrillation, particularly for straightforward informational content.