Arravalli Tharunya, Chadaga Krishnaraj, Muralikrishna H, Sampathila Niranjana, Cenitta D, Chadaga Rajagopala, Swathi K S
Department of Electronics and Communication Engineering, Manipal Institute of Technology, Manipal Academy of Higher Education, Manipal, Karnataka, 576104, India.
Manipal Institute of Technology, Manipal Academy of Higher Education, Manipal, 576104, India.
Sci Rep. 2025 Jul 24;15(1):26931. doi: 10.1038/s41598-025-12644-w.
Breast cancer is characterized by the proliferation of abnormal breast cells that eventually form malignant tumors. These cancer cells can metastasize, at which point the disease becomes life-threatening. A complex interplay of environmental factors and individual genetic makeup can give rise to this carcinoma. Given the rising incidence of breast cancer, improvements in its diagnosis and treatment are essential. Over the past few decades, machine learning has helped improve the accuracy of medical diagnosis. This study therefore used patients' diagnostic characteristics and multiple machine learning classifiers to identify breast cancer. Explainable artificial intelligence techniques were incorporated to reveal the factors underlying the model predictions, adding a layer of transparency and interpretability. Of the algorithms evaluated, random forest performed best, with an F1-score of 84%. The stacked ensemble model, which combines the strengths of different models, achieved an F1-score of 83%. The study used explainers such as SHAP (SHapley Additive exPlanations), LIME (Local Interpretable Model-agnostic Explanations), ELI5 (Explain Like I'm Five), Anchor and QLattice (Quantum Lattice) to interpret the findings. Interpretable algorithms can be applied in the medical sector to assist practitioners in predicting breast cancer, reducing diagnostic errors, and improving clinical decision-making.
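Because the abstract reports model quality as an F1-score, a minimal sketch of how that metric is computed for a binary classifier may help. It is the harmonic mean of precision and recall. The helper name `binary_f1` and the toy labels below are hypothetical and are not taken from the study's data or code.

```python
def binary_f1(y_true, y_pred, positive=1):
    """Return the F1-score of the positive class (hypothetical helper)."""
    pairs = list(zip(y_true, y_pred))
    # Count true positives, false positives and false negatives.
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    if tp == 0:
        # No correct positive predictions: F1 is conventionally 0.
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    # Harmonic mean of precision and recall.
    return 2 * precision * recall / (precision + recall)


# Toy labels (illustrative only): 1 = malignant, 0 = benign.
y_true = [1, 1, 1, 1, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 1, 0, 0, 0]

score = binary_f1(y_true, y_pred)
print(f"F1 = {score:.2f}")  # prints "F1 = 0.75"
```

With three true positives, one false positive and one false negative, precision and recall are both 0.75, so the F1-score is also 0.75. An F1-score of 84% corresponds to the same calculation on a model's test-set predictions.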