Division of Nuclear Medicine and Molecular Imaging, Geneva University Hospital, Geneva, Switzerland.
Rajaie Cardiovascular Medical and Research Center, Iran University of Medical Sciences, Tehran, Iran.
Clin Oncol (R Coll Radiol). 2022 Feb;34(2):114-127. doi: 10.1016/j.clon.2021.11.014. Epub 2021 Dec 3.
AIMS: Despite the promising results achieved by radiomics prognostic models in various clinical applications, several challenges remain. Two main limitations are the restricted information available from a single imaging modality and the difficulty of selecting the optimal machine learning and feature selection methods for a given modality and clinical outcome. In this work, we applied several feature selection and machine learning methods to single-modality positron emission tomography (PET) and computed tomography (CT) and to multimodality PET/CT fusion, to identify the best combinations for each radiomics modality for overall survival prediction in non-small cell lung cancer patients.
MATERIALS AND METHODS: A PET/CT dataset from The Cancer Imaging Archive, including subjects from two independent institutions (87 and 95 patients), was used in this study. Each cohort was used once as the training set and once as the test set, and the results were averaged. ComBat harmonisation was used to address the centre effect. In our proposed radiomics framework, apart from single-modality PET and CT models, multimodality radiomics models were developed using multilevel fusion (at the feature and image levels). Two methods were considered for the feature-level strategy: concatenating PET and CT features into a single feature set or, alternatively, averaging them. For image-level fusion, we used three methods, namely wavelet fusion, guided filtering-based fusion and latent low-rank representation fusion. In the proposed prognostic modelling framework, combinations of four feature selection and seven machine learning methods were applied to all radiomics modalities (two single-modality and five multimodality), machine learning hyper-parameters were optimised and, finally, the models were evaluated in the test cohort with 1000 bootstrap repetitions. The feature selection and machine learning methods were chosen because they are popular in the literature, are supported by open-source software in the public domain and can cope with continuous time-to-event survival data. Multifactor ANOVA was used to carry out the variability analysis, and the proportion of total variance explained by radiomics modality, feature selection method and machine learning method was calculated using a bias-corrected effect size estimate known as ω².
RESULTS: The optimum feature selection and machine learning methods differed depending on the radiomics modality applied. However, minimum depth (MD) as the feature selection method and the Lasso and Elastic-Net regularized generalized linear model (glmnet) as the machine learning method had the highest average results. Results from the ANOVA test indicated that the variability that each factor (radiomics modality, feature selection method and machine learning method) introduces to model performance is case specific, i.e. the variances differ between radiomics modalities and fusion strategies. Overall, the greatest proportion of variance was explained by the machine learning method, except for models in the feature-level fusion strategy.
CONCLUSION: The identification of optimal feature selection and machine learning methods is a crucial step in developing sound and accurate radiomics risk models. Furthermore, the optimum methods are case specific, differing with the radiomics modality and fusion strategy used.
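As an illustration of the variability analysis, the bias-corrected effect size omega-squared (ω²) can be computed from the quantities in an ANOVA table using its standard definition, (SS_effect - df_effect * MS_error) / (SS_total + MS_error). The sketch below is not the authors' code. It assumes a simplified one-way design rather than the multifactor ANOVA used in the study, and the function names and toy scores are hypothetical, chosen only to show the arithmetic.

```python
def omega_squared(ss_effect, df_effect, ss_total, ms_error):
    """Standard omega-squared: share of total variance attributed to one factor."""
    return (ss_effect - df_effect * ms_error) / (ss_total + ms_error)


def one_way_anova_table(groups):
    """Sums of squares for a one-way design (simplifying assumption;
    the study itself used a multifactor ANOVA)."""
    all_values = [v for g in groups for v in g]
    grand_mean = sum(all_values) / len(all_values)
    group_means = [sum(g) / len(g) for g in groups]
    # Between-group (effect) and within-group (error) sums of squares
    ss_effect = sum(len(g) * (m - grand_mean) ** 2
                    for g, m in zip(groups, group_means))
    ss_error = sum((v - m) ** 2
                   for g, m in zip(groups, group_means) for v in g)
    df_effect = len(groups) - 1
    df_error = len(all_values) - len(groups)
    return {
        "ss_effect": ss_effect,
        "df_effect": df_effect,
        "ss_total": ss_effect + ss_error,
        "ms_error": ss_error / df_error,
    }


# Toy performance scores for three hypothetical methods (not study data)
toy_scores = [[1, 2, 3], [2, 3, 4], [5, 6, 7]]
table = one_way_anova_table(toy_scores)
w2 = omega_squared(**table)
print(round(w2, 4))
```

In a multifactor setting the same formula would be applied once per factor (radiomics modality, feature selection method, machine learning method), each with its own sum of squares and degrees of freedom and the shared error mean square.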