Yu Boyang, Melmed Kara R, Frontera Jennifer, Zhu Weicheng, Huang Haoxu, Qureshi Adnan I, Maggard Abigail, Steinhof Michael, Kuohn Lindsey, Kumar Arooshi, Berson Elisa R, Tran Anh T, Payabvash Seyedmehdi, Ironside Natasha, Brush Benjamin, Dehkharghani Seena, Razavian Narges, Ranganath Rajesh
Center for Data Science, New York University, New York, NY, USA.
Department of Neurology, NYU Grossman School of Medicine, New York, NY, USA.
Neurocrit Care. 2025 Feb 7. doi: 10.1007/s12028-025-02214-3.
Early prediction of hematoma expansion (HE) following nontraumatic intracerebral hemorrhage (ICH) may inform preemptive therapeutic interventions. We sought to identify how accurately machine learning (ML) radiomics models predict HE compared with expert clinicians using head computed tomography (HCT).
We used data from 900 study participants with ICH enrolled in the Antihypertensive Treatment of Acute Cerebral Hemorrhage 2 Study. ML models were developed using baseline HCT images and admission clinical data in a training cohort (n = 621), and their performance in predicting HE (defined as hematoma growth of 33% or > 6 mL at 24 h) was evaluated in an independent test cohort (n = 279). We simultaneously surveyed expert clinicians and asked them to predict HE using the same initial HCT images and clinical data. Areas under the receiver operating characteristic curve (AUCs) were compared between clinician predictions, ML models using radiomic data only (a random forest classifier and a deep learning imaging model), and ML models using both radiomic and clinical data (three random forest classifier models using different feature combinations). Kappa values measuring interrater reliability among expert clinicians were calculated. The best performing model was compared with clinical prediction.
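As a rough illustration of the outcome definition and the AUC metric described above, the following minimal sketch labels HE from two hematoma volumes and computes an AUC by pairwise ranking. The function names, the treatment of the 33% threshold as inclusive, and the toy inputs in the usage note are assumptions made for illustration; the study's actual code and data are not given in the abstract.

```python
def hematoma_expansion(baseline_ml, followup_ml,
                       rel_threshold=0.33, abs_threshold_ml=6.0):
    """Return True if growth from baseline to 24 h meets either HE criterion.

    Assumption: the relative criterion is inclusive (>= 33%); the abstract
    does not say whether the boundary itself counts.
    """
    growth_ml = followup_ml - baseline_ml
    if growth_ml > abs_threshold_ml:
        return True
    # Guard against division by zero for a (hypothetical) zero baseline volume.
    return baseline_ml > 0 and growth_ml / baseline_ml >= rel_threshold


def roc_auc(labels, scores):
    """AUC as the probability that a random positive case scores higher
    than a random negative case; ties count as one half."""
    pos = [s for y, s in zip(labels, scores) if y]
    neg = [s for y, s in zip(labels, scores) if not y]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative case")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))
```

For example, `hematoma_expansion(30.0, 37.0)` is true because the absolute growth exceeds 6 mL even though the relative growth is below 33%. The same `roc_auc` function could be applied to clinician ratings and to model scores alike, which is what makes the two directly comparable.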
The AUC for expert clinician prediction of HE was 0.591, with a kappa of 0.156 for interrater reliability. By comparison, the ML model using radiomic data only (a deep learning model using image input) achieved an AUC of 0.680, and the ML model using both radiomic and clinical data (a random forest model) achieved an AUC of 0.677. The intraclass correlation coefficient for clinical judgment and the best performing ML model was 0.47 (95% confidence interval 0.23-0.75).
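To make the kappa figure concrete, here is a minimal sketch of Cohen's kappa for a single pair of raters. The abstract does not state which kappa variant was used; with more than two clinicians, Fleiss' kappa or an average of pairwise values would be plausible alternatives, so this is an illustrative assumption rather than the study's method. The function name is hypothetical.

```python
def cohen_kappa(rater_a, rater_b):
    """Cohen's kappa: agreement between two raters, corrected for chance."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("raters must label the same non-empty set of cases")
    n = len(rater_a)
    # Observed agreement: fraction of cases where both raters gave the same label.
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    # Chance agreement: product of each rater's marginal rate, summed over labels.
    categories = set(rater_a) | set(rater_b)
    expected = sum((rater_a.count(c) / n) * (rater_b.count(c) / n)
                   for c in categories)
    if expected == 1.0:
        raise ValueError("kappa is undefined when both raters use a single label")
    return (observed - expected) / (1.0 - expected)
```

A value of 1 indicates perfect agreement and 0 indicates agreement no better than chance, so a kappa near 0.156 corresponds to only slight agreement beyond chance.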
We introduced supervised ML algorithms whose prediction of HE may outperform that of practicing clinicians. Despite overall moderate AUCs, our results set a new relative benchmark for performance on a task that even expert clinicians find challenging. These results emphasize the need for continued improvement and for enhanced clinical decision support to optimally manage patients with ICH.