Institute of Medical Imaging Engineering, University of Shanghai for Science and Technology, Shanghai 200093, China.
Department of Radiology, Shanghai Ninth People's Hospital, Shanghai Jiao Tong University School of Medicine, Shanghai 200011, China.
J Xray Sci Technol. 2022;30(6):1155-1168. doi: 10.3233/XST-221224.
To investigate the value of a CT-based radiomics model in distinguishing the active phase of Crohn's disease (CD) from the remission phase.
CT images of 101 patients diagnosed with CD were retrospectively collected, comprising 60 patients in the active phase and 41 in the remission phase. The patients were randomly divided into a training group and a test group at a ratio of 7:3. First, the lesion areas were manually delineated by a physician, and radiomics features were extracted from each lesion. Next, features were selected using the t-test and the least absolute shrinkage and selection operator (LASSO) regression algorithm. Then, five machine learning algorithms, namely random forest (RF), extreme gradient boosting (XGBoost), support vector machine (SVM), logistic regression (LR), and K-nearest neighbor (KNN), were each used to construct a CD activity classification model. Finally, a soft-voting mechanism was used to integrate the better-performing algorithms into an ensemble for binary classification, and receiver operating characteristic (ROC) curves were used to evaluate the diagnostic value of the models.
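The selection and ensemble steps described above can be sketched roughly as follows. This is a minimal illustration with scikit-learn and SciPy, not the authors' implementation: the feature matrix is synthetic (generated with `make_classification`), and the p-value cutoff of 0.05, the Student's t-test variant, the 5-fold `LassoCV`, the standardization step, and all hyperparameters are assumptions that the abstract does not specify.

```python
import numpy as np
from scipy.stats import ttest_ind
from sklearn.datasets import make_classification
from sklearn.ensemble import VotingClassifier
from sklearn.linear_model import LassoCV, LogisticRegression
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

# Synthetic stand-in for a radiomics feature matrix (assumption: 200 features).
# Label 1 = active phase, label 0 = remission, roughly 60 vs 41 cases.
X, y = make_classification(
    n_samples=101, n_features=200, n_informative=10,
    weights=[0.41, 0.59], random_state=0,
)

# 7:3 random split, stratified so both phases appear in each group.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, stratify=y, random_state=0
)

# Standardize using training statistics only, to avoid leaking test data.
scaler = StandardScaler().fit(X_train)
X_train_s = scaler.transform(X_train)
X_test_s = scaler.transform(X_test)

# Step 1: per-feature two-sample t-test between the two phases.
_, p_values = ttest_ind(X_train_s[y_train == 1], X_train_s[y_train == 0], axis=0)
keep_t = np.flatnonzero(p_values < 0.05)  # assumed cutoff
if keep_t.size == 0:  # fallback so the sketch always runs
    keep_t = np.arange(X.shape[1])

# Step 2: LASSO keeps the features with non-zero coefficients.
lasso = LassoCV(cv=5, random_state=0).fit(X_train_s[:, keep_t], y_train)
keep = keep_t[lasso.coef_ != 0]
if keep.size == 0:  # fallback if LASSO shrinks everything to zero
    keep = keep_t

# Step 3: soft voting averages the predicted class probabilities
# of SVM, LR, and KNN (the combination named in the abstract).
ensemble = VotingClassifier(
    estimators=[
        ("svm", SVC(probability=True, random_state=0)),
        ("lr", LogisticRegression(max_iter=1000)),
        ("knn", KNeighborsClassifier(n_neighbors=5)),
    ],
    voting="soft",
)
ensemble.fit(X_train_s[:, keep], y_train)

# Probability of the active phase for each test case, and the test AUC.
proba_active = ensemble.predict_proba(X_test_s[:, keep])[:, 1]
test_auc = roc_auc_score(y_test, proba_active)
print(f"selected features: {keep.size}, test AUC: {test_auc:.3f}")
```

Because the data here are random, the printed AUC says nothing about the study's results; the sketch only shows how the pieces fit together. Performing selection on the training split alone is a design choice made here to keep the test group untouched.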
On both the training set and the test set, the AUC of each of the five machine learning classification models reached 0.85 or higher. The ensemble soft-voting classifier combining SVM, LR, and KNN better distinguished active CD from CD in remission. On the test set, the AUC was 0.938, and accuracy, sensitivity, and specificity were 0.903, 0.911, and 0.892, respectively.
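For reference, the four reported metrics can be computed from predicted probabilities as in the sketch below. The helper `binary_metrics`, the toy labels and scores, and the 0.5 decision threshold are illustrative assumptions; the abstract does not state which threshold was used, and treating the active phase as the positive class is also assumed here.

```python
import numpy as np
from sklearn.metrics import confusion_matrix, roc_auc_score


def binary_metrics(y_true, y_score, threshold=0.5):
    """Return AUC, accuracy, sensitivity, and specificity.

    y_true: 1 = active phase (positive class, assumed), 0 = remission.
    y_score: predicted probability of the positive class.
    """
    y_true = np.asarray(y_true)
    y_score = np.asarray(y_score)
    y_pred = (y_score >= threshold).astype(int)
    # With labels=[0, 1], ravel() yields tn, fp, fn, tp in that order.
    tn, fp, fn, tp = confusion_matrix(y_true, y_pred, labels=[0, 1]).ravel()
    return {
        "auc": roc_auc_score(y_true, y_score),        # threshold-free
        "accuracy": (tp + tn) / (tp + tn + fp + fn),
        "sensitivity": tp / (tp + fn),                 # true positive rate
        "specificity": tn / (tn + fp),                 # true negative rate
    }


# Toy example: 4 active and 4 remission cases with made-up scores.
y_true = [1, 1, 1, 1, 0, 0, 0, 0]
y_score = [0.9, 0.8, 0.7, 0.4, 0.6, 0.3, 0.2, 0.1]
metrics = binary_metrics(y_true, y_score)
print(metrics)
# auc = 15/16 = 0.9375; accuracy, sensitivity, specificity = 0.75 each
```

AUC is computed from the ranking of the scores, so it does not depend on the threshold, whereas accuracy, sensitivity, and specificity all change if the threshold is moved.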
This study demonstrated that the established radiomics model could objectively and effectively identify CD activity, and that the ensemble approach offered better diagnostic performance.