Radiomics and deep learning methods for the prediction of 2-year overall survival in LUNG1 dataset.
Affiliations
Physics and Astronomy Department "Galileo Galilei", University of Padova, Via Marzolo 8, 35131, Padua, Italy.
INFN, Sezione di Padova, Via Marzolo 8, 35131, Padua, Italy.
Publication information
Sci Rep. 2022 Aug 19;12(1):14132. doi: 10.1038/s41598-022-18085-z.
In this study, we tested and compared radiomics and deep learning-based approaches on the public LUNG1 dataset for the prediction of 2-year overall survival (OS) in non-small cell lung cancer patients. Radiomic features were extracted from the gross tumor volume using Pyradiomics, while deep features were extracted from bi-dimensional tumor slices by a convolutional autoencoder. Both radiomic and deep features were fed to 24 different pipelines formed by combining four feature selection/reduction methods with six classifiers. Direct classification through convolutional neural networks (CNNs) was also performed. Each approach was investigated with and without the inclusion of clinical parameters. The maximum area under the receiver operating characteristic curve (AUC) on the test set improved from 0.59, obtained for the baseline clinical model, to 0.67 ± 0.03, 0.63 ± 0.03 and 0.67 ± 0.02 for models based on radiomic features, deep features, and their combination, respectively, and to 0.64 ± 0.04 for direct CNN classification. Despite the high number of pipelines and approaches tested, results were comparable and in line with previous work, confirming that it is challenging to extract further imaging-based information from the LUNG1 dataset for the prediction of 2-year OS.
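The pipeline grid and the AUC metric described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not name the four feature selection/reduction methods or the six classifiers, so `SELECTORS` and `CLASSIFIERS` below hold placeholder labels, and the helper names `build_pipeline_grid`, `roc_auc`, and `summarize` are hypothetical. The sketch also assumes that a value such as 0.67 ± 0.03 is a mean and standard deviation over repeated runs, which the abstract does not state.

```python
from itertools import product
from statistics import mean, stdev

# Placeholder labels only: the abstract does not list the actual methods.
SELECTORS = ["selector_1", "selector_2", "selector_3", "selector_4"]
CLASSIFIERS = ["clf_1", "clf_2", "clf_3", "clf_4", "clf_5", "clf_6"]


def build_pipeline_grid(selectors, classifiers):
    """Pair every selection/reduction method with every classifier."""
    return list(product(selectors, classifiers))


def roc_auc(labels, scores):
    """Area under the ROC curve via the Mann-Whitney statistic.

    Equals the fraction of (positive, negative) pairs in which the
    positive sample receives the higher score; ties count as one half.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative sample")
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))


def summarize(aucs):
    """Mean and sample standard deviation of AUCs from repeated runs."""
    return mean(aucs), stdev(aucs)


if __name__ == "__main__":
    grid = build_pipeline_grid(SELECTORS, CLASSIFIERS)
    print(f"{len(grid)} pipelines, e.g. {grid[0]}")

    # Toy labels (1 = event within 2 years) and scores, purely illustrative.
    labels = [0, 0, 1, 1]
    scores = [0.10, 0.40, 0.35, 0.80]
    print(f"toy AUC = {roc_auc(labels, scores):.2f}")
```

In practice each placeholder label would map to a concrete estimator (for example, objects from a machine learning library), and `summarize` would be applied to the test-set AUCs collected across repetitions of each pipeline.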