Identification of optimal biomarkers associated with distant metastasis in breast cancer using Boruta and Lasso machine learning algorithms.

Author information

Qin Jia-Ning, Dai Wen-Bin, Zhang Wen-Hai, Chen Bin-Jie, Liang Ling, Liang Chun-Feng, Lu Chun-Guo, Tan Qi-Xing, Wei Chang-Yuan, Tan Yang, Wu Fang

Affiliations

Department of Radiation Oncology, The First Affiliated Hospital of Guangxi Medical University, No.6 Shuangyong Road, Nanning, 530000, Guangxi, China.

Department of Pathology, Liuzhou People's Hospital, Guangxi Medical University, Liuzhou, China.

Publication information

BMC Cancer. 2025 Aug 13;25(1):1311. doi: 10.1186/s12885-025-14664-1.

Abstract

OBJECTIVE

The aim of this study was to identify optimal biomarkers associated with distant metastasis in patients with breast cancer from among nutritional and inflammatory indicators using the Boruta and Least Absolute Shrinkage and Selection Operator (LASSO) machine learning algorithms, thereby improving the ability to identify distant metastasis.

METHODS

A total of 348 patients newly diagnosed with breast cancer were included, comprising 185 patients with nonmetastatic breast cancer and 163 patients with distant metastatic breast cancer. The variables were initially screened using the Boruta algorithm, followed by further optimization through LASSO regression. The selected key indicators were evaluated for their association with distant metastasis risk using multivariate logistic regression analysis and restricted cubic spline functions. Discriminative performance was assessed through ROC curve analysis.
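The ROC step can be illustrated with a minimal sketch. It computes the AUC as the probability that a randomly chosen metastatic patient has a higher marker value than a randomly chosen nonmetastatic patient, with ties counted as one half. The function name `roc_auc` and the 0/1 label coding are assumptions made for illustration; the abstract does not state which software the authors used.

```python
def roc_auc(labels, scores):
    """Area under the ROC curve via pairwise comparison.

    labels: iterable of 0 (nonmetastatic) or 1 (distant metastasis); this coding is assumed.
    scores: iterable of marker values, where higher is taken to mean higher risk.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative case")

    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0   # positive case ranked above negative case
            elif p == n:
                wins += 0.5   # ties contribute one half
    return wins / (len(pos) * len(neg))


if __name__ == "__main__":
    # Toy data, not from the study.
    print(roc_auc([0, 0, 1, 1], [0.1, 0.4, 0.35, 0.8]))
```

For a marker where higher values indicate lower risk, negating the scores before the call keeps the AUC on the usual 0.5 to 1 scale.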

RESULTS

Boruta and LASSO analyses identified five important indicators: the advanced lung cancer inflammation index (ALI), systemic inflammation response index (SIRI), monocyte-to-lymphocyte ratio (MLR), albumin-to-globulin ratio (AGR), and geriatric nutritional risk index (GNRI). Multivariate logistic regression analysis revealed that elevated SIRI and MLR were associated with an increased risk of distant metastasis in patients with breast cancer, whereas higher ALI, AGR, and GNRI were associated with a reduced risk. ROC analysis indicated moderate predictive performance for these indicators, with AUC values of approximately 0.65.
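As a rough illustration of how the five indicators are derived from routine laboratory and anthropometric values, the sketch below uses their commonly cited definitions. The abstract does not give the formulas, so the definitions, units, and function names here are assumptions and may differ from those used in the study. In particular, globulin is taken as total protein minus albumin, and the ideal body weight is passed in rather than computed.

```python
def mlr(monocytes, lymphocytes):
    """Monocyte-to-lymphocyte ratio (cell counts in the same unit, e.g. 10^9/L)."""
    return monocytes / lymphocytes


def siri(neutrophils, monocytes, lymphocytes):
    """Systemic inflammation response index: neutrophils x monocytes / lymphocytes."""
    return neutrophils * monocytes / lymphocytes


def agr(albumin_g_l, total_protein_g_l):
    """Albumin-to-globulin ratio; globulin assumed to be total protein minus albumin."""
    globulin = total_protein_g_l - albumin_g_l
    return albumin_g_l / globulin


def ali(bmi, albumin_g_dl, neutrophils, lymphocytes):
    """Advanced lung cancer inflammation index: BMI x albumin (g/dL) / NLR."""
    nlr = neutrophils / lymphocytes
    return bmi * albumin_g_dl / nlr


def gnri(albumin_g_l, weight_kg, ideal_weight_kg):
    """Geriatric nutritional risk index: 1.489 x albumin (g/L) + 41.7 x weight/ideal weight.

    The weight ratio is capped at 1 when actual weight exceeds ideal weight.
    """
    ratio = min(weight_kg / ideal_weight_kg, 1.0)
    return 1.489 * albumin_g_l + 41.7 * ratio


if __name__ == "__main__":
    # Hypothetical patient values, not taken from the study.
    print(mlr(0.5, 2.0), siri(4.0, 0.5, 2.0), agr(40.0, 70.0))
    print(ali(22.0, 4.0, 4.0, 2.0), gnri(40.0, 60.0, 60.0))
```

Note the unit difference between the two albumin-based indices: ALI conventionally takes albumin in g/dL, while GNRI takes it in g/L.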

CONCLUSION

The ALI, SIRI, MLR, AGR, and GNRI are effective biomarkers for identifying the risk of distant metastasis in patients with breast cancer. These indicators could be incorporated into clinical practice to improve risk stratification, guide personalized treatment, and enhance patient outcomes.
