Noortman Wyanne A, Vriens Dennis, Bussink Johan, Meijer Tineke W H, Aarntzen Erik H J G, Deroose Christophe M, Lhommel Renaud, Aide Nicolas, Le Tourneau Christophe, de Koster Elizabeth J, Oyen Wim J G, Triemstra Lianne, Ruurda Jelle P, Vegt Erik, de Geus-Oei Lioe-Fee, van Velden Floris H P
Department of Radiology, Section of Nuclear Medicine, Leiden University Medical Center, Leiden, The Netherlands.
Department of Medical Oncology, University Medical Center Groningen, Groningen, The Netherlands.
Eur Radiol. 2025 May 7. doi: 10.1007/s00330-025-11637-7.
The aim of this study was to map multicollinearity of the radiomic feature set in five independent [18F]FDG-PET cohorts with different tumour types and identify generalizable non-redundant features.
Five [18F]FDG-PET radiomic cohorts were analysed: non-small cell lung carcinomas (N = 35), pheochromocytomas and paragangliomas (N = 40), head and neck squamous cell carcinomas (N = 54), [18F]FDG-positive thyroid nodules with indeterminate cytology (N = 84), and gastric carcinomas (N = 206). Lesions were delineated, and 105 radiomic features were extracted using PyRadiomics. In every cohort, a matrix of Spearman's rank correlation coefficients (ρ) between features was calculated to determine which features showed a strong (ρ > 0.7) or very strong (ρ > 0.9) correlation with any other feature in all five cohorts. Cluster analysis of the correlation matrix averaged over all cohorts was performed at thresholds of ρ = 0.7 and ρ = 0.9. For each cluster, a representative, non-redundant feature was selected.
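The correlation and clustering steps described above can be sketched as follows. This is a minimal illustration on synthetic data, not the authors' pipeline: the abstract does not state the clustering algorithm, the linkage method, whether absolute correlations were used, or how the representative feature was chosen, so average-linkage hierarchical clustering on the distance 1 − |ρ| and the "most central member" rule below are assumptions, and `make_cohort` is a hypothetical stand-in for real feature tables.

```python
import numpy as np
from scipy.cluster.hierarchy import fcluster, linkage
from scipy.spatial.distance import squareform
from scipy.stats import spearmanr


def spearman_matrix(features):
    """Absolute Spearman matrix for an (n_lesions, n_features) array (n_features > 2)."""
    rho, _ = spearmanr(features)  # columns are treated as variables
    return np.abs(rho)  # assumption: the sign of the correlation is ignored


def correlated_in_all_cohorts(matrices, threshold):
    """Mask of features exceeding `threshold` with any other feature in every cohort."""
    masks = []
    for matrix in matrices:
        off_diagonal = matrix.copy()
        np.fill_diagonal(off_diagonal, 0.0)  # ignore self-correlation
        masks.append((off_diagonal > threshold).any(axis=1))
    return np.logical_and.reduce(masks)


def select_representatives(matrices, threshold):
    """Cluster the cohort-averaged matrix and keep one feature index per cluster."""
    mean_rho = np.mean(matrices, axis=0)
    distance = 1.0 - mean_rho
    distance = (distance + distance.T) / 2.0  # enforce exact symmetry
    np.fill_diagonal(distance, 0.0)
    tree = linkage(squareform(distance, checks=False), method="average")  # assumed linkage
    labels = fcluster(tree, t=1.0 - threshold, criterion="distance")
    representatives = []
    for label in np.unique(labels):
        members = np.flatnonzero(labels == label)
        within = mean_rho[np.ix_(members, members)]
        # Assumed rule: keep the member with the highest mean correlation to its cluster.
        representatives.append(int(members[np.argmax(within.mean(axis=1))]))
    return sorted(representatives), labels


def make_cohort(rng, n_lesions):
    """Hypothetical cohort: features 0-2 share one latent factor, 3-4 another, 5 is independent."""
    a = rng.normal(size=n_lesions)
    b = rng.normal(size=n_lesions)

    def noisy(x):
        return x + 0.05 * rng.normal(size=n_lesions)

    return np.column_stack(
        [noisy(a), np.exp(noisy(a)), 2.0 * noisy(a), noisy(b), noisy(b) ** 3,
         rng.normal(size=n_lesions)]
    )


rng = np.random.default_rng(0)
matrices = [spearman_matrix(make_cohort(rng, n)) for n in (35, 40, 54, 84, 206)]
for threshold in (0.7, 0.9):
    redundant_mask = correlated_in_all_cohorts(matrices, threshold)
    representatives, labels = select_representatives(matrices, threshold)
    print(threshold, int(redundant_mask.sum()), representatives)
```

The cohort sizes in the demo mirror those reported above, but the feature values are random; with real data, each array would hold the 105 extracted features per lesion. Cutting the tree at a distance of 1 − ρ means that a lower ρ threshold merges more features into each cluster and therefore leaves fewer representatives.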
Of the 105 features, 72 showed a very strong (ρ > 0.9) correlation and 90 showed a strong (ρ > 0.7) correlation with at least one other feature in all five cohorts. Cluster analysis resulted in 35 and 15 non-redundant features at thresholds of ρ = 0.9 and ρ = 0.7, including 6 and 3 shape features, 4 and 2 intensity features, and 25 and 10 texture features, respectively. Accordingly, 70 and 90 redundant features could be omitted at these thresholds, respectively.
At least two-thirds of the radiomic feature set could be omitted because of strong multicollinearity in multiple independent cohorts. More redundant features could be identified using a less conservative threshold. Future research should indicate whether multicollinearity of the radiomic feature set is similar for other radiopharmaceuticals and imaging modalities.
Question: Radiomic feature sets contain many strongly correlated features, which results in statistical challenges.
Findings: Analysis of the correlation matrices showed that the same radiomic features were strongly correlated in five independent [18F]FDG-PET cohorts with different tumour types.
Clinical relevance: At least two-thirds of the radiomic feature set could be omitted because of strong multicollinearity. More redundant features could be identified using a less conservative threshold.