Morgan Howard E, Wang Kai, Dohopolski Michael, Liang Xiao, Folkert Michael R, Sher David J, Wang Jing
Department of Radiation Oncology, University of Texas Southwestern Medical Center, Dallas, TX, USA.
Medical Artificial Intelligence and Automation Laboratory, University of Texas Southwestern Medical Center, Dallas, TX, USA.
Quant Imaging Med Surg. 2021 Dec;11(12):4781-4796. doi: 10.21037/qims-21-274.
Local failure (LF) following chemoradiation (CRT) for head and neck cancer is associated with poor overall survival. If machine learning techniques could stratify patients at risk of treatment failure based on baseline and intra-treatment imaging, such a model could facilitate response-adapted approaches to escalate, de-escalate, or switch therapy.
This 1:2 retrospective case-control cohort included patients treated at a single institution with definitive radiotherapy for head and neck cancer who failed locally, in-field, at a primary or nodal structure. Radiomic features were extracted with PyRadiomics from baseline CT and CBCT scans at fractions 1 and 21 (delta) of radiotherapy and were then filtered on three criteria: reproducibility (intra-class correlation coefficient ≥0.95), redundancy [maximum relevance and minimum redundancy (mRMR)], and informativeness [recursive feature elimination (RFE)]. Separate models predicting LF of primaries or nodes were created using the explainable boosting machine (EBM) classifier with 5-fold cross-validation for (I) clinical only, (II) radiomic only (CT1 and delta features), and (III) fused models (clinical + radiomic). Twenty-five iterations were performed, and predicted scores were averaged with a parallel ensemble design. Receiver operating characteristic curves were compared between models with paired-samples t-tests.
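The parallel ensemble design described above (repeated 5-fold cross-validation with the out-of-fold scores averaged over 25 iterations) can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the nearest-centroid scorer is a deliberately simple stand-in for the EBM classifier, reshuffling the folds on every iteration is an assumption, and the function names and toy data are hypothetical.

```python
import random

def stratified_folds(labels, k, rng):
    """Assign each sample to one of k folds, keeping the class balance per fold."""
    fold_of = [None] * len(labels)
    for cls in sorted(set(labels)):
        idx = [i for i, y in enumerate(labels) if y == cls]
        rng.shuffle(idx)
        for pos, i in enumerate(idx):
            fold_of[i] = pos % k
    return fold_of

def centroid_scorer(train_x, train_y):
    """Stand-in model (NOT an EBM): higher score means closer to the class-1 centroid."""
    def centroid(cls):
        rows = [x for x, y in zip(train_x, train_y) if y == cls]
        return [sum(col) / len(rows) for col in zip(*rows)]
    c0, c1 = centroid(0), centroid(1)
    def dist2(a, b):
        return sum((u - v) ** 2 for u, v in zip(a, b))
    return lambda x: dist2(x, c0) - dist2(x, c1)

def ensemble_cv_scores(features, labels, n_iter=25, k=5, seed=0):
    """Average each sample's out-of-fold score over n_iter reshuffled k-fold runs."""
    rng = random.Random(seed)
    n = len(labels)
    totals = [0.0] * n
    for _ in range(n_iter):
        fold_of = stratified_folds(labels, k, rng)
        for fold in range(k):
            train = [i for i in range(n) if fold_of[i] != fold]
            test = [i for i in range(n) if fold_of[i] == fold]
            score = centroid_scorer([features[i] for i in train],
                                    [labels[i] for i in train])
            for i in test:
                totals[i] += score(features[i])
    # Each sample is held out exactly once per iteration, so divide by n_iter.
    return [t / n_iter for t in totals]

# Hypothetical toy data with a 1:2 case:control ratio (10 cases, 20 controls).
toy_features = [[3 + 0.1 * i, 3 - 0.1 * i] for i in range(10)] + \
               [[0.1 * j - 1, 1 - 0.1 * j] for j in range(20)]
toy_labels = [1] * 10 + [0] * 20
avg_scores = ensemble_cv_scores(toy_features, toy_labels)
```

Averaging across iterations reduces the dependence of any one sample's score on a particular fold assignment, which is presumably the motivation for the ensemble; an actual EBM would replace `centroid_scorer` here.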
The fused ensemble model for primaries (using clinical, CT1, and delta features) achieved an AUC of 0.871 with a sensitivity of 78.3% and specificity of 90.9% at the maximum Youden J statistic. It trended towards improvement over the clinical only ensemble model (AUC =0.788, P=0.134) and was significantly better than the radiomic only ensemble model (AUC =0.770, P=0.017). The fused ensemble model for nodes achieved an AUC of 0.910 with a sensitivity of 100.0% and specificity of 68.0%, which also trended towards improvement over the clinical model (AUC =0.865, P=0.080).
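For readers unfamiliar with the operating point used above, the Youden J statistic is J = sensitivity + specificity − 1, and the reported sensitivity and specificity are read off at the score threshold that maximizes it. The following is a small sketch of that selection on made-up scores; the function name and the toy data are hypothetical and are not taken from the study.

```python
def youden_threshold(labels, scores):
    """Return (threshold, sensitivity, specificity, J) at the maximum Youden J.

    A sample is called positive when its score is >= the threshold.
    Ties in J keep the lowest threshold.
    """
    positives = sum(1 for y in labels if y == 1)
    negatives = len(labels) - positives
    if positives == 0 or negatives == 0:
        raise ValueError("both classes must be present")
    best = None
    for t in sorted(set(scores)):
        tp = sum(1 for y, s in zip(labels, scores) if y == 1 and s >= t)
        tn = sum(1 for y, s in zip(labels, scores) if y == 0 and s < t)
        sens = tp / positives
        spec = tn / negatives
        j = sens + spec - 1
        if best is None or j > best[3]:
            best = (t, sens, spec, j)
    return best

# Made-up example: 4 failures (label 1) and 4 controls (label 0).
example_labels = [1, 1, 1, 0, 0, 0, 0, 1]
example_scores = [0.9, 0.8, 0.4, 0.3, 0.2, 0.6, 0.1, 0.7]
threshold, sens, spec, j = youden_threshold(example_labels, example_scores)
# → threshold 0.4, sensitivity 1.0, specificity 0.75, J 0.75
```

Because J weights sensitivity and specificity equally, the chosen threshold can land on quite different trade-offs for different models, as the primary and nodal results above illustrate.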
The fused ensemble EBM model achieved high discriminatory ability at predicting LF for head and neck cancer in independent primary and nodal structures. Although an additive benefit of delta radiomics over clinical factors could not be proven, the results trended towards improvement with the fused ensemble model; this is promising and warrants prospective investigation in a larger cohort.