Zhang Li, Du Qinglin, Shen Mengyi, He Xin, Zhang Dingyi, Huang Xiaohua
Department of Radiology, Affiliated Hospital of North Sichuan Medical College, No 1 Maoyuan South Road, Nanchong, 637000, Sichuan, China.
Sci Rep. 2025 Apr 17;15(1):13318. doi: 10.1038/s41598-025-97247-1.
This study aimed to develop an interpretable machine learning model that accurately predicts Ki-67 expression in breast cancer (BC) patients using a combination of dynamic contrast-enhanced magnetic resonance imaging (DCE-MRI) radiomics and clinical-imaging features. A total of 195 BC patients with 201 lesions were retrospectively enrolled. The lesions were randomly divided into training and test sets at a 7:3 ratio. Associations between clinical-imaging features and Ki-67 expression were analyzed by univariate analysis and binary logistic regression, and the results were used to build a Clinical-imaging model. Radiomics features were extracted from the early and delayed phases of DCE-MRI and screened using the Pearson correlation coefficient and recursive feature elimination (RFE). A logistic regression classifier was used to build the Radiomics model. The clinical-imaging and radiomics features were then combined to form a Combined model. The Shapley Additive Explanations (SHAP) algorithm was used to explain the optimal model, and the area under the receiver operating characteristic curve (AUC) was used to assess model performance. Among the imaging features, the internal enhancement pattern and necrosis differed markedly according to Ki-67 expression. In the training set, the AUCs of the Clinical-imaging, Radiomics, and Combined models were 0.682, 0.797, and 0.821, respectively; in the test set, they were 0.666, 0.796, and 0.802. According to the integrated discrimination improvement (IDI), the Combined model outperformed the Clinical-imaging and Radiomics models by 11.8% and 2.1%, respectively, in the training set, and by 11% and 1.74% in the test set. SHAP analysis showed that ph2-original-shape-surface volume ratio was the most important feature in the model. The interpretable machine learning model based on clinical-imaging features and DCE-MRI radiomics can accurately predict Ki-67 expression in BC.
Combining the SHAP algorithm with the model improves its interpretability, which may assist clinicians in formulating more accurate treatment strategies.
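The two evaluation metrics named in the abstract, AUC and IDI, can be illustrated with a minimal sketch. It assumes the standard definitions: AUC as the probability that a positive case receives a higher score than a negative one, and IDI as the difference in discrimination slopes between a new and an old model. The function names and the toy labels and probabilities below are hypothetical; they are not taken from the study and do not reproduce its results.

```python
from statistics import mean


def auc(y_true, scores):
    """AUC by pairwise comparison of positives and negatives (ties count 0.5)."""
    pos = [s for y, s in zip(y_true, scores) if y == 1]
    neg = [s for y, s in zip(y_true, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative case")
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))


def idi(y_true, p_old, p_new):
    """Integrated discrimination improvement of p_new over p_old.

    Each model's discrimination slope is the mean predicted probability
    among events minus the mean among non-events; IDI is the difference
    between the two slopes.
    """
    def discrimination_slope(probs):
        events = [p for y, p in zip(y_true, probs) if y == 1]
        nonevents = [p for y, p in zip(y_true, probs) if y == 0]
        return mean(events) - mean(nonevents)

    return discrimination_slope(p_new) - discrimination_slope(p_old)


# Hypothetical toy data: 1 = high Ki-67 expression, 0 = low (illustration only).
y = [1, 1, 1, 0, 0, 0]
p_clinical = [0.6, 0.4, 0.7, 0.5, 0.3, 0.6]  # made-up "old" model outputs
p_combined = [0.8, 0.6, 0.9, 0.4, 0.2, 0.5]  # made-up "new" model outputs

if __name__ == "__main__":
    print("AUC clinical:", round(auc(y, p_clinical), 3))
    print("AUC combined:", round(auc(y, p_combined), 3))
    print("IDI combined vs clinical:", round(idi(y, p_clinical, p_combined), 3))
```

Under this definition, an IDI reported as 11.8% would correspond to a discrimination-slope gain of 0.118. Whether the authors computed IDI in exactly this way is not stated in the abstract.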