Tan C W, Chan J C Y, Chan J J I, Nagarajan S, Sng B L
Department of Women's Anesthesia, KK Women's and Children's Hospital, Singapore; Anesthesiology and Perioperative Sciences Academic Clinical Program, Duke-NUS Medical School, Singapore.
Department of Women's Anesthesia, KK Women's and Children's Hospital, Singapore.
Int J Obstet Anesth. 2025 Aug;63:104688. doi: 10.1016/j.ijoa.2025.104688. Epub 2025 May 20.
Recent studies evaluating frequently asked questions (FAQs) on labor epidural analgesia (LEA) have used only generic questions, without incorporating the detailed clinical information that reflects patient-specific inputs. We investigated the performance of ChatGPT in addressing questions related to LEA, with an emphasis on individual preferences and clinical conditions.
Twenty-nine questions for the AI chatbot were derived from commonly asked questions on LEA, framed around specific clinical conditions. Responses were generated in January 2025; the first question under each sub-topic was initiated as a "New chat" in ChatGPT-4o, and subsequent question(s) within the same sub-topic were posed in the same chat, following the predefined sequence. The readability of each response was graded using six readability indices, while accuracy and Patient Education Materials Assessment Tool for Print (PEMAT) understandability and actionability were assessed by four obstetric anesthesiologists.
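The abstract does not specify the tooling used to query ChatGPT-4o or to compute the readability indices; the protocol described suggests the ChatGPT web interface was used directly. As an illustration only, the sketch below shows one way such a protocol could be reproduced programmatically, assuming the OpenAI `gpt-4o` API and the `textstat` Python package (with six commonly reported indices) as stand-ins for the unspecified tools; the sub-topics, questions, and index selection are hypothetical.

```python
# Illustrative sketch only: one conversation per sub-topic, follow-up questions
# posed in the same chat, and readability scoring of each response. The API,
# package, and index choices are assumptions, not the authors' actual tooling.
from openai import OpenAI
import textstat

client = OpenAI()  # assumes OPENAI_API_KEY is set in the environment

# Hypothetical sub-topics mapped to predefined question sequences.
SUB_TOPICS = {
    "timing": [
        "I am 3 cm dilated; is it too early to ask for an epidural?",
        "If my labor progresses quickly, can it still be placed later?",
    ],
    "back_pain": [
        "I have chronic lower back pain; is labor epidural analgesia safe for me?",
    ],
}

# Six commonly used readability indices from textstat (assumed, not confirmed).
READABILITY_INDICES = {
    "flesch_reading_ease": textstat.flesch_reading_ease,
    "flesch_kincaid_grade": textstat.flesch_kincaid_grade,
    "gunning_fog": textstat.gunning_fog,
    "smog_index": textstat.smog_index,
    "coleman_liau_index": textstat.coleman_liau_index,
    "automated_readability_index": textstat.automated_readability_index,
}

def run_sub_topic(questions):
    """Ask a question sequence in one shared conversation and score each reply."""
    messages = []  # shared history acts as the "same chat" for the sub-topic
    results = []
    for question in questions:
        messages.append({"role": "user", "content": question})
        reply = client.chat.completions.create(model="gpt-4o", messages=messages)
        answer = reply.choices[0].message.content
        messages.append({"role": "assistant", "content": answer})
        results.append({
            "question": question,
            "answer": answer,
            "readability": {name: fn(answer) for name, fn in READABILITY_INDICES.items()},
        })
    return results

if __name__ == "__main__":
    for topic, questions in SUB_TOPICS.items():
        for record in run_sub_topic(questions):
            print(topic, record["question"], record["readability"])
```

Keeping the running `messages` list per sub-topic is what preserves conversational context across follow-up questions, mirroring the "continue in the same chat" step of the study protocol.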
The mean readability indices of the ChatGPT-4o responses were generally rated as fairly difficult to very difficult, corresponding to a US reading grade level between 11th grade and college entry. The mean (± standard deviation) accuracy of the responses was 97.7% ± 8.1%. The PEMAT understandability and actionability scores were 97.9% ± 0.9% and 98.0% ± 1.4%, respectively.
ChatGPT can provide accurate and readable information about LEA across different clinical contexts. However, further refinement is needed, for example through suitable prompts that simplify the outputs and improve readability, in order to meet the need for effective delivery of reliable patient education information.
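One widely used way to pursue the prompt refinement suggested above, which was not evaluated in this study, is to append an explicit reading-level constraint as a follow-up turn in the same chat and re-score the simplified answer. The snippet below is a hypothetical continuation of the earlier sketch (reusing its `client`, `messages`, and `textstat` imports); the prompt wording and target grade level are assumptions.

```python
# Hypothetical follow-up prompt to simplify a prior answer; not part of the study protocol.
SIMPLIFY_PROMPT = (
    "Please rewrite your previous answer in plain language at about a "
    "6th-grade (US) reading level, keeping all medically important points."
)
messages.append({"role": "user", "content": SIMPLIFY_PROMPT})
simplified = client.chat.completions.create(model="gpt-4o", messages=messages)
simplified_text = simplified.choices[0].message.content

# Re-score the simplified output to check whether readability actually improved.
print(textstat.flesch_kincaid_grade(simplified_text))
```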