Sun Xiaoyan
Obstetrics and Gynecology, Jinan Maternity and Child Care Hospital, Jinan, 250000, Shandong, China.
Sci Rep. 2025 Jan 20;15(1):2569. doi: 10.1038/s41598-025-86014-x.
Breast cancer is one of the most common malignant tumors among women and poses a serious threat to women's health worldwide. As science and technology advance, a growing number of new techniques are being applied to the diagnosis and prediction of breast cancer. In recent years, intelligent medical assistants built on data mining and machine learning algorithms have supported doctors' diagnoses. This study proposes an improved LightGBM hybrid integration model. First, gradient harmonic loss and cross-entropy loss are introduced to increase the model's attention to minority classes in the dataset and to alleviate the impact of data imbalance on diagnostic results. Second, a whale optimization algorithm is designed to iteratively optimize LightGBM's hyperparameters and improve the overall performance of the model. Third, a Jacobian regularization method is proposed to denoise LightGBM and reduce the model's sensitivity to noise. Finally, the LightGBM hybrid integration model is developed to ensure accurate and stable diagnosis on diverse and imbalanced datasets. The proposed method was compared and validated on the dataset from the UCI Machine Learning Repository, and the results show that it achieves good diagnostic performance on all indicators. The hybrid integration model proposed in this paper can provide effective auxiliary support for doctors diagnosing breast cancer.
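The abstract does not give the details of the whale optimization step, so the following is only a minimal sketch of a generic whale optimization loop (encircling, random search, and spiral updates) over a box of hyperparameter bounds. It is not the author's implementation. The function names `whale_optimize` and `toy_loss`, the choice of `learning_rate` and `num_leaves` as the tuned parameters, and the bounds are all assumptions made for illustration; `toy_loss` is a hypothetical stand-in for a cross-validated LightGBM loss so that the block runs with the standard library alone.

```python
import math
import random


def whale_optimize(objective, bounds, n_whales=20, n_iter=100, b=1.0, seed=0):
    """Minimize `objective` over box `bounds` with a basic whale optimization loop."""
    rng = random.Random(seed)

    def clip(x):
        # Keep every coordinate inside its (low, high) bound.
        return [min(max(v, lo), hi) for v, (lo, hi) in zip(x, bounds)]

    # Random initial population of candidate solutions ("whales").
    whales = [[rng.uniform(lo, hi) for lo, hi in bounds] for _ in range(n_whales)]
    best = min(whales, key=objective)[:]
    best_val = objective(best)

    for t in range(n_iter):
        a = 2.0 * (1.0 - t / n_iter)  # decreases linearly from 2 toward 0
        for i, x in enumerate(whales):
            A = 2.0 * a * rng.random() - a  # scalar step coefficient per whale
            C = 2.0 * rng.random()
            if rng.random() < 0.5:
                # Encircle the best whale when |A| < 1, else explore around a random whale.
                ref = best if abs(A) < 1.0 else whales[rng.randrange(n_whales)]
                new = [r - A * abs(C * r - v) for r, v in zip(ref, x)]
            else:
                # Spiral update toward the best whale.
                l = rng.uniform(-1.0, 1.0)
                spiral = math.exp(b * l) * math.cos(2.0 * math.pi * l)
                new = [abs(r - v) * spiral + r for r, v in zip(best, x)]
            whales[i] = clip(new)
        # Update the best solution after each sweep over the population.
        cand = min(whales, key=objective)
        cand_val = objective(cand)
        if cand_val < best_val:
            best, best_val = cand[:], cand_val
    return best, best_val


def toy_loss(params):
    """Hypothetical stand-in for a validation loss; NOT a real LightGBM evaluation."""
    learning_rate, num_leaves = params
    return (learning_rate - 0.1) ** 2 + ((num_leaves - 31.0) / 100.0) ** 2


# Assumed search ranges for (learning_rate, num_leaves); illustrative only.
BOUNDS = [(0.01, 0.3), (8.0, 128.0)]
best, best_val = whale_optimize(toy_loss, BOUNDS)
print("best parameters:", best, "loss:", best_val)
```

In practice `toy_loss` would be replaced by a function that trains LightGBM with the candidate parameters and returns a validation metric to minimize, with integer parameters such as `num_leaves` rounded before use.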