Interpretable Machine Learning Models for Differentiating Glioblastoma From Solitary Brain Metastasis Using Radiomics.

Author information

Xia Xueming, Wu Wenjun, Tan Qiaoyue, Gou Qiheng

Affiliations

Division of Head & Neck Tumor Multimodality Treatment, Cancer Center, West China Hospital, Sichuan University, Chengdu, China (X.X., Q.G.).

Department of Radiology, Union Hospital, Tongji Medical College, Huazhong University of Science and Technology, Wuhan, China (W.W.).

Publication information

Acad Radiol. 2025 May 27. doi: 10.1016/j.acra.2025.05.016.

Abstract

PURPOSE

To develop and validate interpretable machine learning models for differentiating glioblastoma (GB) from solitary brain metastasis (SBM) using radiomics features from contrast-enhanced T1-weighted MRI (CE-T1WI), and to compare the impact of low-order and high-order features on model performance.

METHODS

This retrospective study analyzed 434 patients with histopathologically confirmed GB (n = 226) or SBM (n = 208). Radiomic features were extracted from CE-T1WI, and feature selection was performed with minimum redundancy maximum relevance (mRMR) and least absolute shrinkage and selection operator (LASSO) regression. Machine learning models, including GradientBoost and LightGBM (LGBM), were trained on low-order and high-order features. Model performance was assessed with receiver operating characteristic (ROC) analysis and the area under the curve (AUC), together with accuracy, specificity, and sensitivity. SHapley Additive Explanations (SHAP) analysis was used to quantify the influence of each feature on the model's predictions.
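To make the SHAP step concrete, the following is a minimal sketch of the exact Shapley value computation that SHAP approximates. It is not the authors' pipeline: the function `shapley_values`, the scoring function `toy_score`, the three feature values, and the all-zero baseline are hypothetical, and a feature left out of a coalition is simply set to its baseline value.

```python
from itertools import combinations
from math import factorial


def shapley_values(predict, x, baseline):
    """Exact Shapley values for one sample `x` relative to `baseline`.

    Features outside a coalition take their baseline value. The cost grows
    exponentially with the number of features, so this only suits toy cases.
    """
    n = len(x)
    phi = [0.0] * n
    for i in range(n):
        others = [j for j in range(n) if j != i]
        for size in range(n):
            # Shapley weight for coalitions of this size that exclude feature i
            weight = factorial(size) * factorial(n - size - 1) / factorial(n)
            for subset in combinations(others, size):
                without_i = [x[j] if j in subset else baseline[j] for j in range(n)]
                with_i = list(without_i)
                with_i[i] = x[i]
                phi[i] += weight * (predict(with_i) - predict(without_i))
    return phi


def toy_score(f):
    # Hypothetical score: one additive term plus one interaction term
    return 2.0 * f[0] + 1.0 * f[1] * f[2]


x = [1.0, 2.0, 3.0]         # made-up feature values for one lesion
baseline = [0.0, 0.0, 0.0]  # made-up reference sample
phi = shapley_values(toy_score, x, baseline)
print(phi)
```

In this toy case the additive feature receives its full effect, while the two interacting features split their joint effect equally, and the attributions sum to the difference between the prediction for `x` and the prediction for the baseline.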

RESULTS

Performance differed notably across the machine learning models on both the training and validation datasets. In the training group, the LGBM, CatBoost, multilayer perceptron (MLP), and GradientBoost models achieved the highest AUCs, all exceeding 0.9, indicating strong discriminative power. The LGBM model was the most stable, with an AUC difference of only 0.005 between the training and test sets, suggesting good generalizability. In the validation group, the GradientBoost classifier achieved the highest AUC (0.927), closely followed by random forest (0.925). GradientBoost also showed high sensitivity (0.911) and negative predictive value (NPV, 0.889), indicating that it identified true positives effectively. The LGBM model had the highest test accuracy (86.2%) and performed well on sensitivity (0.911), NPV (0.895), and positive predictive value (PPV, 0.837). Models using high-order features outperformed those based on low-order features on all metrics. SHAP analysis further enhanced model interpretability, providing insight into feature importance and each feature's contribution to classification decisions.
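For readers less familiar with the reported indicators, the sketch below shows how accuracy, sensitivity, specificity, PPV, NPV, and AUC are derived from labels and predicted scores. The labels and scores are invented for illustration and are not the study's data. Treating label 1 as the positive class and thresholding scores at 0.5 are assumptions, as are the helper names `confusion_counts`, `classification_metrics`, and `auc`.

```python
def confusion_counts(y_true, y_pred):
    """Count true/false positives and negatives for binary labels (1 = positive)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    return tp, tn, fp, fn


def classification_metrics(tp, tn, fp, fn):
    """Standard indicators; assumes no denominator is zero."""
    return {
        "accuracy": (tp + tn) / (tp + tn + fp + fn),
        "sensitivity": tp / (tp + fn),  # true positive rate
        "specificity": tn / (tn + fp),  # true negative rate
        "ppv": tp / (tp + fp),          # positive predictive value
        "npv": tn / (tn + fn),          # negative predictive value
    }


def auc(y_true, scores):
    """AUC as the probability that a positive case outscores a negative one.

    Ties count as half a win (the Mann-Whitney formulation of ROC AUC).
    """
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Invented example: 4 positive and 4 negative cases
y_true = [1, 1, 1, 1, 0, 0, 0, 0]
scores = [0.9, 0.8, 0.7, 0.4, 0.6, 0.3, 0.2, 0.1]
y_pred = [1 if s >= 0.5 else 0 for s in scores]  # assumed threshold of 0.5

metrics = classification_metrics(*confusion_counts(y_true, y_pred))
auc_value = auc(y_true, scores)
print(metrics, auc_value)
```

AUC depends only on how the scores rank the cases, whereas the other five indicators depend on the chosen threshold, which is why a model can lead on AUC while another leads on accuracy.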

CONCLUSION

Radiomics-based machine learning can effectively distinguish GB from SBM, with gradient-boosted tree models such as LGBM showing superior performance. High-order features significantly improved model accuracy and robustness. SHAP analysis enhanced the interpretability and transparency of the models, providing an intuitive visualization of how radiomic features contribute to classification.
