Makimoto Kalysta, Au Ryan, Moslemi Amir, Hogg James C, Bourbeau Jean, Tan Wan C, Kirby Miranda
Toronto Metropolitan University, Kerr Hall South Bldg. Room - KHS-344, 350 Victoria St., Toronto, M5B 2K3, Ontario, Canada.
Western University, London, Ontario, Canada.
Acad Radiol. 2023 May;30(5):900-910. doi: 10.1016/j.acra.2022.07.016. Epub 2022 Aug 12.
Texture-based radiomics analysis of lung computed tomography (CT) images has been shown to predict chronic obstructive pulmonary disease (COPD) status using machine learning models. However, various approaches are used and it is unclear which provides the best performance.
To compare the most commonly used feature selection and classification methods and determine the optimal models for classifying COPD status in a mild, population-based COPD cohort.
CT images from the multi-center Canadian Cohort Obstructive Lung Disease (CanCOLD) study were pre-processed by resampling each image to 1 mm isotropic voxels, segmenting the lung and removing the airways (VIDA Diagnostics Inc.), and applying a threshold of -1000 HU to 0 HU. A total of 95 texture features were then extracted from each CT image. Combinations of 17 feature selection methods and 9 classifiers were tested and evaluated. In addition, the role of data cleaning (outlier removal and highly correlated feature removal) was evaluated. The area under the curve (AUC) from the receiver operating characteristic curve was used to evaluate model performance.
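As an illustration of the "highly correlated feature removal" step, the following is a minimal sketch only. The abstract does not state how the correlation filter was implemented, so the greedy strategy, the 0.9 cutoff, the function names, and the toy feature names and values are all assumptions.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation between two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0  # constant feature: treat as uncorrelated
    return sxy / sqrt(sxx * syy)


def drop_correlated(features, cutoff=0.9):
    """Greedily keep features whose |r| with every kept feature is below cutoff.

    `features` maps a feature name to its per-participant values.
    The 0.9 cutoff is an assumption, not a value taken from the study.
    """
    kept = []
    for name, values in features.items():
        if all(abs(pearson(values, features[k])) < cutoff for k in kept):
            kept.append(name)
    return kept


# Toy data: hypothetical feature names and made-up values.
toy = {
    "glcm_contrast": [1.0, 2.0, 3.0, 4.0, 5.0],
    "glcm_contrast_scaled": [2.1, 4.0, 6.2, 8.0, 10.1],
    "glrlm_run_entropy": [5.0, 1.0, 4.0, 2.0, 3.0],
}
selected = drop_correlated(toy)
print(selected)
```

In this toy case the second feature is nearly a rescaled copy of the first, so it is dropped, while the weakly correlated third feature is kept.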
A total of 1204 participants were evaluated (n = 602 no COPD, n = 602 COPD). There were no significant differences between the groups for female sex (no COPD = 46.3%; COPD = 38.5%; p = 0.77), or body mass index (no COPD = 27.7 kg/m²; COPD = 27.4 kg/m²; p = 0.21). The highest AUC value for predicting COPD status (AUC = 0.78 [0.73, 0.84]) was obtained following data cleaning and feature selection using Elastic Net with the Linear-SVM classifier.
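For readers unfamiliar with the metric, the AUC can be read as the probability that a randomly chosen participant with COPD receives a higher model score than a randomly chosen participant without COPD. The sketch below computes it that way on made-up labels and scores; the function name and data are hypothetical, and it does not reproduce the study's models or its confidence interval.

```python
def roc_auc(labels, scores):
    """AUC as the fraction of (positive, negative) pairs ranked correctly.

    Ties between a positive and a negative score count as half a win.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative label")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Made-up example: 1 = COPD, 0 = no COPD; scores are classifier outputs.
labels = [0, 0, 1, 1]
scores = [0.1, 0.4, 0.35, 0.8]
auc = roc_auc(labels, scores)
print(auc)  # 3 of the 4 positive/negative pairs are ordered correctly: 0.75
```

An AUC of 0.5 corresponds to chance-level ranking and 1.0 to perfect separation, which places the reported 0.78 in context.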
In a population-based cohort, the optimal combination for radiomics-based prediction of COPD status was Elastic Net as the feature selection method and Linear-SVM as the classifier.