Teng Xinzhi, Zhang Jiang, Ma Zongrui, Zhang Yuanpeng, Lam Saikit, Li Wen, Xiao Haonan, Li Tian, Li Bing, Zhou Ta, Ren Ge, Lee Francis Kar-Ho, Au Kwok-Hung, Lee Victor Ho-Fun, Chang Amy Tien Yee, Cai Jing
Department of Health Technology and Informatics, The Hong Kong Polytechnic University, Hong Kong, Hong Kong SAR, China.
Department of Clinical Oncology, Queen Elizabeth Hospital, Hong Kong, Hong Kong SAR, China.
Front Oncol. 2022 Oct 14;12:974467. doi: 10.3389/fonc.2022.974467. eCollection 2022.
Using highly robust radiomic features in modeling is recommended, yet the impact of doing so on the resulting radiomic model is unclear. This study evaluated radiomic model robustness and generalizability after low-robust features were screened out before modeling. The results were validated on four datasets and two clinically relevant tasks.
Computed tomography images, gross tumor volume segmentations, and clinically relevant outcomes (distant metastasis and local-regional recurrence) of 1,419 head-and-neck cancer patients were collected from four publicly available datasets. A perturbation method was used to simulate images, and radiomic feature robustness was quantified using the intraclass correlation coefficient (ICC). Three radiomic models were built using all features (ICC > 0), good-robust features (ICC > 0.75), and excellent-robust features (ICC > 0.95), respectively. Filter-based feature selection and Ridge classification were used to construct the radiomic models. Model performance was assessed in terms of both robustness and generalizability. Model robustness was evaluated by the ICC, and model generalizability was quantified by the train-test difference in the area under the receiver operating characteristic curve (AUC).
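The feature-screening step described above can be illustrated with a minimal sketch. It assumes a one-way random-effects ICC(1,1); the abstract does not state which ICC form the study used, and the function names, the `table` layout, and the toy values are hypothetical.

```python
from statistics import mean


def icc_one_way(rows):
    """One-way random-effects ICC(1,1) (an assumed ICC form).

    rows: one list per patient, holding the same feature measured on
    each of k perturbed images of that patient.
    """
    n, k = len(rows), len(rows[0])
    grand = mean(v for row in rows for v in row)
    row_means = [mean(row) for row in rows]
    # Between-patient and within-patient mean squares.
    msb = k * sum((m - grand) ** 2 for m in row_means) / (n - 1)
    msw = sum(
        (v - m) ** 2 for row, m in zip(rows, row_means) for v in row
    ) / (n * (k - 1))
    return (msb - msw) / (msb + (k - 1) * msw)


def select_robust(feature_table, threshold):
    """Return the names of features whose ICC exceeds the threshold."""
    return [
        name
        for name, rows in feature_table.items()
        if icc_one_way(rows) > threshold
    ]


# Toy data: 3 patients x 3 perturbations per feature (illustrative only).
table = {
    "stable": [[1.0, 1.1, 0.9], [5.0, 5.1, 4.9], [9.0, 9.1, 8.9]],
    "moderate": [[1.0, 2.0, 3.0], [4.0, 5.0, 6.0], [7.0, 8.0, 9.0]],
    "unstable": [[1.0, 5.0, 9.0], [5.0, 9.0, 1.0], [9.0, 1.0, 5.0]],
}

good_robust = select_robust(table, 0.75)       # ICC > 0.75
excellent_robust = select_robust(table, 0.95)  # ICC > 0.95
```

Raising the threshold from 0.75 to 0.95 shrinks the candidate feature pool, which is the trade-off the study examines: fewer but more stable inputs for the downstream feature selection and classifier.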
The average model robustness ICC improved significantly from 0.65 to 0.78 (P < 0.0001) using good-robust features and to 0.91 (P < 0.0001) using excellent-robust features. Model generalizability also increased substantially: the gap between training and testing AUC narrowed, with the mean train-test AUC difference reduced from 0.21 to 0.18 (P < 0.001) with good-robust features and to 0.12 (P < 0.0001) with excellent-robust features. Furthermore, good-robust features yielded the best average AUC in the unseen datasets, 0.58 (P < 0.001), across the four datasets and the clinical outcomes.
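The train-test AUC difference used as the generalizability measure can be sketched as follows. This is an illustration under stated assumptions rather than the study's implementation: the gap is taken as training AUC minus testing AUC (the abstract does not specify the sign convention), and the function names and toy scores are hypothetical.

```python
def auc(labels, scores):
    """AUC as the probability that a positive case outscores a negative
    one, with ties counted as half a win."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative case")
    wins = sum((p > q) + 0.5 * (p == q) for p in pos for q in neg)
    return wins / (len(pos) * len(neg))


def generalizability_gap(train, test):
    """train and test are (labels, scores) pairs.

    Returns training AUC minus testing AUC (assumed sign convention);
    a smaller gap indicates better generalizability.
    """
    return auc(*train) - auc(*test)


# Toy classifier scores (illustrative only).
train = ([0, 0, 1, 1], [0.1, 0.2, 0.8, 0.9])
test = ([0, 0, 1, 1], [0.1, 0.6, 0.5, 0.9])
gap = generalizability_gap(train, test)
```

A small gap on its own does not imply a useful model: the gap can also shrink because training AUC falls, so it is read alongside the AUC on the unseen data.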
Including only robust features in radiomic modeling significantly improves model robustness and generalizability in unseen datasets. However, the robustness of a radiomic model still has to be verified even when it is built with robust radiomic features, and tightly restricting feature robustness may prevent optimal model performance in unseen datasets, as it may lower the discriminative power of the model.