Basaran Alim Emre, Güresir Agi, Knoch Hanna, Vychopen Martin, Güresir Erdem, Wach Johannes
Department of Neurosurgery, University Hospital Leipzig, Leipzig, Saxony, Germany.
Neurosurg Rev. 2025 Jan 11;48(1):40. doi: 10.1007/s10143-025-03194-w.
The aim was to assess the predictive accuracy of advanced AI language models and established clinical scales in prognosticating outcomes for patients with aneurysmal subarachnoid hemorrhage (aSAH). This retrospective cohort study included 82 patients with aSAH. We evaluated the predictive performance of AtlasGPT and ChatGPT 4.0 by examining the area under the curve (AUC), sensitivity, specificity, and Youden's Index, in comparison to established clinical grading scales: the World Federation of Neurological Surgeons (WFNS) scale, the Subarachnoid Hemorrhage Early Brain Edema Score (SEBES), and the Fisher scale. The assessment focused on four endpoints: in-hospital mortality, need for decompressive hemicraniectomy, functional outcome at discharge, and functional outcome at 6-month follow-up. In-hospital mortality occurred in 22% of the cohort, and 34.1% required decompressive hemicraniectomy during treatment. At hospital discharge, 28% of patients had a favorable outcome (modified Rankin Scale [mRS] ≤ 2), which rose to 46.9% at the 6-month follow-up. For 30-day in-hospital survival, the WFNS grading scale yielded an AUC of 0.72 with 59.4% sensitivity and 83.3% specificity. AtlasGPT provided the highest diagnostic accuracy (AUC 0.80, 95% CI: 0.70-0.91) for predicting the need for decompressive hemicraniectomy, with 82.1% sensitivity and 77.8% specificity. For discharge outcomes, the WFNS score and AtlasGPT showed similar prognostic value, with AUCs of 0.74 and 0.75, respectively. Long-term functional outcome was best predicted by the WFNS scale, with an AUC of 0.76. The study demonstrates the potential of integrating AI models such as AtlasGPT with clinical scales to enhance outcome prediction in aSAH patients. While established scales like WFNS remain reliable, AI language models show promise, particularly in predicting the necessity for surgical intervention and short-term functional outcomes.
The study explored the use of advanced AI language models, AtlasGPT and ChatGPT 4.0, to predict outcomes for patients with aneurysmal subarachnoid hemorrhage (aSAH). It found that AtlasGPT provided the highest diagnostic accuracy for predicting the need for decompressive hemicraniectomy, outperforming traditional clinical scales, while both AI models showed promise in enhancing outcome predictions when integrated with established clinical assessment tools.
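As a small illustration of how the reported metrics relate, Youden's Index is conventionally defined as J = sensitivity + specificity - 1. The sketch below applies that definition to the two sensitivity/specificity pairs quoted in the abstract. The function name `youden_index` and the dictionary labels are hypothetical, and the resulting J values (roughly 0.43 for WFNS and 0.60 for AtlasGPT) are arithmetic derived here from the abstract's figures, not values reported by the authors.

```python
def youden_index(sensitivity: float, specificity: float) -> float:
    """Return Youden's J = sensitivity + specificity - 1.

    Both inputs are fractions in [0, 1], not percentages.
    """
    if not (0.0 <= sensitivity <= 1.0 and 0.0 <= specificity <= 1.0):
        raise ValueError("sensitivity and specificity must be fractions in [0, 1]")
    return sensitivity + specificity - 1.0


# Sensitivity/specificity pairs as quoted in the abstract, converted to fractions.
# The labels are illustrative only.
reported = {
    "WFNS, 30-day in-hospital survival": (0.594, 0.833),
    "AtlasGPT, decompressive hemicraniectomy": (0.821, 0.778),
}

for label, (sens, spec) in reported.items():
    print(f"{label}: J = {youden_index(sens, spec):.3f}")
```

J ranges from -1 to 1, with 0 corresponding to a test no better than chance at the chosen cutoff, so a higher J indicates a better combined trade-off between sensitivity and specificity. The two values here refer to different endpoints and are not a head-to-head comparison.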