Phenotyping the Histopathological Subtypes of Non-Small-Cell Lung Carcinoma: How Beneficial Is Radiomics?

Author information

Pasini Giovanni, Stefano Alessandro, Russo Giorgio, Comelli Albert, Marinozzi Franco, Bini Fabiano

Affiliations

Department of Mechanical and Aerospace Engineering, Sapienza University of Rome, Eudossiana 18, 00184 Rome, Italy.

Institute of Molecular Bioimaging and Physiology, National Research Council (IBFM-CNR), Contrada, Pietrapollastra-Pisciotto, 90015 Cefalù, Italy.

Publication information

Diagnostics (Basel). 2023 Mar 18;13(6):1167. doi: 10.3390/diagnostics13061167.

Abstract

The aim of this study was to investigate the usefulness of radiomics in the absence of well-defined standard guidelines. Specifically, we extracted radiomics features from multicenter computed tomography (CT) images to differentiate between the four histopathological subtypes of non-small-cell lung carcinoma (NSCLC), and we compared how the results varied with the radiomics model. We investigated the presence of batch effects and the impact of feature harmonization on the models' performance. We also investigated how the composition of the training dataset influenced the selected feature subsets and, consequently, the models' performance. Combining data from two publicly available datasets, the study involved a total of 152 squamous cell carcinoma (SCC), 106 large cell carcinoma (LCC), 150 adenocarcinoma (ADC), and 58 not otherwise specified (NOS) cases. Using matRadiomics, an Image Biomarker Standardization Initiative (IBSI) compliant tool, 1781 radiomics features were extracted from each of the malignant lesions identified in the CT images. After batch analysis and feature harmonization, both based on the ComBat tool integrated in matRadiomics, the harmonized and non-harmonized datasets were given as input to a machine learning modeling pipeline. The pipeline consisted of the following steps: (i) training-set/test-set splitting (80/20); (ii) Kruskal-Wallis analysis and LASSO linear regression for feature selection; (iii) model training; (iv) model validation and hyperparameter optimization; and (v) model testing. Model optimization consisted of a 5-fold cross-validated Bayesian optimization, repeated 10 times (inner loop). The whole pipeline was repeated 10 times (outer loop) with six different machine learning classification algorithms. The stability of the feature selection was also evaluated.
The results showed that batch effects were present even when the voxels were resampled to an isotropic form, and that feature harmonization correctly removed them, even though the models' performance decreased. A low accuracy (61.41%) was reached when differentiating between the four subtypes, even though the average area under the curve (AUC) was high (0.831). The NOS subtype was classified almost completely correctly (true positive rate ~90%). When only the SCC and ADC subtypes were considered, the accuracy increased (77.25%) and a high AUC (0.821) was obtained, although harmonization decreased the accuracy to 58%. The features that contributed the most to the models' performance were those extracted from wavelet-decomposed and Laplacian of Gaussian (LoG) filtered images, and they belonged to the texture feature class. In conclusion, we showed that our multicenter data were affected by batch effects, that these effects could significantly alter the models' performance, and that feature harmonization correctly removed them. Although wavelet features seemed to be the most informative, no single definitive feature subset could be identified, since the selected subset changed with the training/test split. Moreover, performance was influenced by the chosen dataset and by the machine learning methods, which could reach a high accuracy in binary classification tasks but could underperform in multiclass problems. It is, therefore, essential that the scientific community propose a more systematic radiomics approach, focusing on multicenter studies, with clear and solid guidelines to facilitate the translation of radiomics to clinical practice.
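
The first part of step (ii), the Kruskal-Wallis filter applied to the training portion of an 80/20 split, can be illustrated with a minimal sketch. This is not the authors' matRadiomics implementation. The data are synthetic, and the per-class (stratified) split, the 0.05 significance level, the omission of a tie correction, and all names are assumptions made for illustration. The LASSO step is not shown.

```python
import random


def average_ranks(values):
    """Return 1-based ranks; tied values receive the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        mean_rank = (start + end) / 2 + 1  # mean of positions start+1 .. end+1
        for pos in range(start, end + 1):
            ranks[order[pos]] = mean_rank
        start = end + 1
    return ranks


def kruskal_wallis_h(groups):
    """Kruskal-Wallis H statistic for a list of samples (no tie correction)."""
    pooled = [v for g in groups for v in g]
    n = len(pooled)
    ranks = average_ranks(pooled)
    total, offset = 0.0, 0
    for g in groups:
        rank_sum = sum(ranks[offset:offset + len(g)])
        total += rank_sum ** 2 / len(g)
        offset += len(g)
    return 12.0 / (n * (n + 1)) * total - 3 * (n + 1)


# Synthetic stand-in for radiomics features: 4 classes, 6 features,
# of which features 0 and 1 shift with the class label (assumption).
rng = random.Random(0)
n_classes, n_per_class, n_features = 4, 40, 6
informative = {0, 1}
samples = []
for label in range(n_classes):
    for _ in range(n_per_class):
        row = [rng.gauss(label if f in informative else 0.0, 1.0)
               for f in range(n_features)]
        samples.append((label, row))
rng.shuffle(samples)

# 80/20 split done per class, so both parts keep the class proportions.
train, test = [], []
for label in range(n_classes):
    rows = [row for lab, row in samples if lab == label]
    cut = len(rows) * 4 // 5
    train += [(label, r) for r in rows[:cut]]
    test += [(label, r) for r in rows[cut:]]

# Keep features whose H exceeds the chi-square critical value for
# 3 degrees of freedom (4 groups) at alpha = 0.05 (assumed threshold).
CHI2_CRIT_DF3 = 7.815
selected = []
for f in range(n_features):
    groups = [[row[f] for lab, row in train if lab == c]
              for c in range(n_classes)]
    if kruskal_wallis_h(groups) > CHI2_CRIT_DF3:
        selected.append(f)

print("features kept by the Kruskal-Wallis filter:", selected)
```

Computing the filter on the training rows only keeps the test set untouched during feature selection, which matches the order of steps (i) and (ii). Because the selection depends on which rows fall in the training set, repeating the split with a different seed can change the retained features.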

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/7858/10046953/b22a66a6b1a2/diagnostics-13-01167-g001.jpg
