Using machine learning to develop a stacking ensemble learning model for the CT radiomics classification of brain metastases.
Author affiliations
Department of Radiotherapy, The Second Affiliated Hospital of Nanchang Medical College, Jiangxi Clinical Research Center for Cancer, Jiangxi Cancer Hospital, Nanchang, 330029, China.
School of Nursing, Southwest Medical University, Luzhou, 646000, China.
Publication information
Sci Rep. 2024 Nov 19;14(1):28575. doi: 10.1038/s41598-024-80210-x.
This study explored the potential of machine-learning techniques for automatically identifying and classifying brain metastases from a radiomic perspective, with the aim of improving the accuracy of tumor volume assessment for radiotherapy. Nine machine-learning algorithms (random forest, support vector machine, gradient boosting machine, XGBoost, decision tree, artificial neural network, k-nearest neighbors, LightGBM, and CatBoost) served as base models for a stacking ensemble model that classified gross tumor volume (GTV), brainstem, and normal brain tissue based on radiomic features. Model performance was assessed with multiple evaluation metrics: specificity, sensitivity, negative predictive value, positive predictive value, accuracy, the Matthews correlation coefficient, and the Youden index. The stacking ensemble model integrated the strengths of the nine base models and consistently outperformed the individual base models in classifying GTV (area under the curve [AUC] = 0.928), brainstem (AUC = 0.932), and normal brain tissue (AUC = 0.942). Among the base models, the support vector machine performed best in the three classifications (AUC = 0.922, 0.909, and 0.928). By contrast, some base models performed poorly in certain contexts, such as high-dimensional feature spaces, notably the decision tree (AUC = 0.709, 0.706, and 0.804) and k-nearest neighbors (AUC = 0.721, 0.663, and 0.729) models. While machine learning shows significant promise in medical image analysis, relying on a single model may lead to suboptimal results. By combining the strengths of various algorithms, the stacking ensemble model offers a better solution for classifying brain metastases based on radiomic features.
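To make the stacking idea concrete, the following is a minimal sketch in Python with scikit-learn, not the authors' implementation. It rests on several assumptions: the data are synthetic (generated with `make_classification`, not radiomic features from the study), only five of the nine base models are included to keep the example free of extra dependencies, and the logistic-regression meta-learner, the 5-fold setting, and all hyperparameters are illustrative choices that the abstract does not specify.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import (
    GradientBoostingClassifier,
    RandomForestClassifier,
    StackingClassifier,
)
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler, label_binarize
from sklearn.svm import SVC
from sklearn.tree import DecisionTreeClassifier

# Class labels 0, 1, 2 stand in for the three tissue types (assumed mapping).
CLASS_NAMES = ["GTV", "brainstem", "normal brain"]

# Synthetic stand-in for a radiomic feature table; NOT the study's data.
X, y = make_classification(
    n_samples=600, n_features=40, n_informative=10, n_classes=3, random_state=0
)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, stratify=y, random_state=0
)

# A subset of the base learners named in the abstract (hyperparameters assumed).
base_models = [
    ("rf", RandomForestClassifier(n_estimators=100, random_state=0)),
    ("svm", make_pipeline(StandardScaler(), SVC(probability=True, random_state=0))),
    ("gbm", GradientBoostingClassifier(n_estimators=50, random_state=0)),
    ("tree", DecisionTreeClassifier(random_state=0)),
    ("knn", make_pipeline(StandardScaler(), KNeighborsClassifier())),
]

# The meta-learner is trained on out-of-fold class probabilities of the base models.
stack = StackingClassifier(
    estimators=base_models,
    final_estimator=LogisticRegression(max_iter=1000),  # assumed meta-learner
    cv=5,
    stack_method="predict_proba",
)
stack.fit(X_train, y_train)


def one_vs_rest_auc(model, X_eval, y_eval):
    """Return the one-vs-rest AUC for each class as a dict keyed by class name."""
    proba = model.predict_proba(X_eval)
    y_bin = label_binarize(y_eval, classes=[0, 1, 2])
    return {
        name: roc_auc_score(y_bin[:, k], proba[:, k])
        for k, name in enumerate(CLASS_NAMES)
    }


stack_auc = one_vs_rest_auc(stack, X_test, y_test)
print("stack", stack_auc)

# Per-class AUC of each fitted base model, for comparison with the ensemble.
for name, estimator in stack.named_estimators_.items():
    print(name, one_vs_rest_auc(estimator, X_test, y_test))
```

On synthetic data the ensemble will not necessarily beat every base model; the sketch only shows how per-class AUCs for an ensemble and its base learners, such as those quoted in the abstract, could be computed side by side.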
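The evaluation metrics named in the abstract all follow from a binary confusion matrix, with each tissue class treated one-vs-rest. The sketch below uses the standard textbook definitions. The function names and the example counts are hypothetical and are not taken from the study.

```python
from math import sqrt


def one_vs_rest_counts(y_true, y_pred, positive):
    """Count TP, FP, TN, FN when `positive` is treated as the positive class."""
    tp = fp = tn = fn = 0
    for truth, pred in zip(y_true, y_pred):
        if pred == positive:
            if truth == positive:
                tp += 1
            else:
                fp += 1
        else:
            if truth == positive:
                fn += 1
            else:
                tn += 1
    return tp, fp, tn, fn


def _ratio(numerator, denominator):
    """Divide safely, returning 0.0 when the denominator is zero."""
    return numerator / denominator if denominator else 0.0


def binary_metrics(tp, fp, tn, fn):
    """Compute the seven metrics listed in the abstract from confusion counts."""
    sensitivity = _ratio(tp, tp + fn)  # true positive rate
    specificity = _ratio(tn, tn + fp)  # true negative rate
    mcc_denominator = sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "ppv": _ratio(tp, tp + fp),  # positive predictive value
        "npv": _ratio(tn, tn + fn),  # negative predictive value
        "accuracy": _ratio(tp + tn, tp + fp + tn + fn),
        "mcc": _ratio(tp * tn - fp * fn, mcc_denominator),
        "youden": sensitivity + specificity - 1,
    }


# Hypothetical counts for one class versus the rest (illustrative only).
metrics = binary_metrics(tp=80, fp=10, tn=190, fn=20)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

The Youden index and the Matthews correlation coefficient each condense the confusion matrix into one number, which makes them convenient for comparing models when the classes are imbalanced.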