CHU de Caen Normandie, Department of Anesthesiology and Critical Care Medicine, Caen, France.
Normandie Univ, UNICAEN, INSERM, U1237, PhIND "Physiopathology and imaging of Neurological Disorders", Institut Blood and Brain @ Caen-Normandie, Cyceron, Caen, France.
Crit Care Med. 2024 Jun 1;52(6):942-950. doi: 10.1097/CCM.0000000000006236. Epub 2024 Mar 6.
To evaluate the capacity of ChatGPT, a widely accessible and uniquely popular artificial intelligence-based chatbot, to predict the 6-month outcome following moderate-to-severe traumatic brain injury (TBI).
Single-center observational retrospective study.
Data are from the neuro-ICU of a level 1 trauma center.
All TBI patients admitted to the ICU between September 2021 and October 2022 were included in a prospective database.
Interventions: none.
Clinical vignettes were built from anonymized clinical, imaging, and biological information available at the patients' hospital admission and extracted from the database, then retrospectively submitted to ChatGPT for prediction of patients' outcomes. The predictions of two intensivists (one neurointensivist and one non-neurointensivist), both from another level 1 trauma center (Beaujon Hospital), were also collected, as was the International Mission on Prognosis and Analysis of Clinical Trials in Traumatic Brain Injury (IMPACT) score. Each intensivist, as well as ChatGPT, made their prognostic evaluations independently, without knowledge of the others' predictions or of the patients' actual management and outcome. Both the intensivists and ChatGPT were given access to the exact same set of information. The main outcome was 6-month functional status, dichotomized into favorable (Glasgow Outcome Scale Extended [GOSE] ≥ 5) versus poor (GOSE < 5). Predictions of intracranial hypertension management, pulmonary infectious risk, and removal of life-sustaining therapies were also investigated as secondary outcomes.
Eighty consecutive moderate-to-severe TBI patients were included. For the 6-month outcome prognosis, the areas under the receiver operating characteristic curve (AUC-ROC) for ChatGPT, the neurointensivist, the non-neurointensivist, and IMPACT were, respectively, 0.62 (0.50-0.74), 0.70 (0.59-0.82), 0.71 (0.59-0.82), and 0.81 (0.72-0.91). ChatGPT had the highest sensitivity (100%) but the lowest specificity (26%). For secondary outcomes, ChatGPT's prognoses were generally less accurate than clinicians' prognoses, with lower AUC values for most outcomes.
This study does not support the use of ChatGPT for prediction of outcomes after TBI.
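To make the reported metrics concrete, the following is a minimal Python sketch of how a dichotomized GOSE outcome, sensitivity, specificity, and AUC-ROC could be computed. It is not the authors' analysis code. The function names, the choice of "poor outcome" as the positive class, and the toy data are all assumptions made for illustration; the toy values are invented and are not study data.

```python
from typing import Sequence, Tuple


def dichotomize_gose(gose: int) -> int:
    """Map a GOSE score (1-8) to 1 = poor (GOSE < 5) or 0 = favorable (GOSE >= 5).

    Treating "poor" as the positive class is an assumption of this sketch.
    """
    if not 1 <= gose <= 8:
        raise ValueError(f"GOSE must be between 1 and 8, got {gose}")
    return 1 if gose < 5 else 0


def sensitivity_specificity(
    y_true: Sequence[int], y_pred: Sequence[int]
) -> Tuple[float, float]:
    """Return (sensitivity, specificity) for binary labels and binary predictions."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    return tp / (tp + fn), tn / (tn + fp)


def auc_roc(y_true: Sequence[int], scores: Sequence[float]) -> float:
    """AUC as the probability that a positive case outranks a negative one.

    Ties count as half a win (Mann-Whitney formulation).
    """
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Hypothetical toy data (NOT from the study): observed 6-month GOSE scores
# and a predictor that over-calls poor outcomes.
observed_gose = [1, 3, 4, 5, 6, 7, 8, 2]
y_true = [dichotomize_gose(g) for g in observed_gose]
y_pred = [1, 1, 1, 1, 1, 0, 1, 1]

sens, spec = sensitivity_specificity(y_true, y_pred)
auc = auc_roc(y_true, y_pred)
print(f"sensitivity={sens:.2f} specificity={spec:.2f} AUC={auc:.3f}")
```

In this toy case every poor outcome is flagged (sensitivity 1.00) while most favorable outcomes are misclassified (specificity 0.25). For a purely binary prediction the AUC equals the mean of sensitivity and specificity, here 0.625, which illustrates how perfect sensitivity can coexist with weak overall discrimination.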