Huchthausen Claire, Shi Menglin, de Sousa Gabriel L A, Colen Jonathan, Shelley Emery, Larner James, Janowski Einsley, Wijesooriya Krishni
Department of Physics, University of Virginia.
Department of Biomedical Engineering, Northwestern University.
ArXiv. 2025 Jan 15:arXiv:2412.16758v2.
Conventional methods for detecting lung cancer early are often qualitative and subject to interpretation. Radiomics provides quantitative characteristics of pulmonary nodules (PNs) in medical images, but variability in medical image acquisition is an obstacle to consistent clinical application of these quantitative features. Correcting radiomic features' dependency on acquisition parameters is problematic when combining data from benign and malignant PNs, as is necessary when the goal is to diagnose lung cancer, because acquisition effects may differ between them due to their biological differences.
We evaluated whether we must account for biological differences between benign and malignant PNs when correcting the dependency of radiomic features on acquisition parameters, and we compared methods of doing this using ComBat harmonization.
This study used a dataset of 567 clinical chest CT scans containing both malignant and benign PNs. Scans were grouped as benign, malignant, or lung cancer screening (mixed benign and malignant). Preprocessing and feature extraction from regions of interest (ROIs) were performed using PyRadiomics. Optimized Permutation Nested ComBat harmonization was applied to the extracted features to account for variability in four acquisition parameters: contrast enhancement, scanner manufacturer, acquisition voltage, and focal spot size. Three methods were compared: harmonizing all data collectively in the standard manner, harmonizing all data with a covariate to preserve distinctions between subgroups, and harmonizing subgroups separately. A Kruskal-Wallis test (significance threshold p ≤ 0.05) determined whether harmonization removed a feature's dependency on an acquisition parameter. A LASSO-SVM pipeline was trained on acquisition-independent radiomic features to predict whether PNs were malignant or benign. To evaluate the predictive information made available by each harmonization method, the trained harmonization estimators and predictive model were applied to a corresponding unseen test set. Harmonization and predictive performance metrics were assessed over 10 trials of 5-fold cross-validation.
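The Kruskal-Wallis screening step can be illustrated with a minimal sketch. It assumes SciPy is available and that a feature counts as acquisition-independent when the test is not significant (p > 0.05); that reading, the helper name `acquisition_independent`, the feature names, and the toy values are illustrative assumptions, not the study's code or data.

```python
from scipy.stats import kruskal

ALPHA = 0.05  # significance threshold quoted in the abstract


def acquisition_independent(features, groups, alpha=ALPHA):
    """Return names of features with no detected dependence on the grouping.

    features: dict mapping feature name -> list of values (one per scan)
    groups:   list of acquisition labels (e.g. scanner manufacturer), same order
    """
    labels = sorted(set(groups))
    kept = []
    for name, values in features.items():
        # Split this feature's values by acquisition label.
        samples = [
            [v for v, g in zip(values, groups) if g == label]
            for label in labels
        ]
        _, p_value = kruskal(*samples)
        # Assumption: p > alpha is read as "no detectable dependency".
        if p_value > alpha:
            kept.append(name)
    return kept


# Made-up toy data: 15 scans from three hypothetical manufacturers.
groups = ["A"] * 5 + ["B"] * 5 + ["C"] * 5
features = {
    # Values rise with the manufacturer label, so the test should flag it.
    "shifted_by_scanner": [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15],
    # Values are interleaved across manufacturers, so no dependence is expected.
    "balanced": [1, 4, 7, 10, 13, 2, 5, 8, 11, 14, 3, 6, 9, 12, 15],
}
independent = acquisition_independent(features, groups)
```

In a workflow like the one described, such a check would be repeated for each acquisition parameter after harmonization, and only features passing every check would be passed to the classifier.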
Kruskal-Wallis testing classified an average of 2.1% of features (95% CI: 1.9-2.4%) as acquisition-independent when data were harmonized collectively, 27.3% of features (95% CI: 25.7-28.9%) when harmonized with a covariate, and 90.9% of features (95% CI: 90.4-91.5%) when harmonized separately. LASSO-SVM models trained on data harmonized separately or with a covariate had higher ROC-AUC for lung cancer screening scans than models trained on data harmonized without distinction between benign and malignant tissues (DeLong test, Holm-Bonferroni adjusted p ≤ 0.05). There was no conclusive difference in ROC-AUC between models trained on data harmonized separately and models trained on data harmonized with a covariate.
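The Holm-Bonferroni adjustment applied to the pairwise ROC-AUC comparisons can be sketched as follows. This is a generic step-down implementation in plain Python, not the study's code; the function name `holm_bonferroni` and the three raw p-values are invented for illustration and are not results from the paper.

```python
def holm_bonferroni(p_values):
    """Return Holm-Bonferroni adjusted p-values in the original order."""
    m = len(p_values)
    # Indices of the p-values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_max = 0.0
    for rank, idx in enumerate(order):
        # The k-th smallest p-value is multiplied by (m - k + 1), capped at 1.
        candidate = min(1.0, (m - rank) * p_values[idx])
        # Enforce monotonicity: an adjusted value never drops below an earlier one.
        running_max = max(running_max, candidate)
        adjusted[idx] = running_max
    return adjusted


# Hypothetical raw p-values for three pairwise model comparisons.
raw = [0.01, 0.04, 0.03]
adjusted = holm_bonferroni(raw)
significant = [p <= 0.05 for p in adjusted]
```

With these made-up inputs, only the first comparison stays at or below 0.05 after adjustment, even though all three raw p-values are below 0.05.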
Radiomic features of benign and malignant PNs require different corrective transformations to recover acquisition-independent distributions. This can be done using separate harmonization or harmonization with a covariate. Separate harmonization enabled the greatest number of predictive features to be used in a machine learning model to retrospectively detect lung cancer. Features harmonized separately and features harmonized with a covariate enabled predictive models to achieve similar performance on lung cancer screening scans.