Department of Oncology, Turku University Hospital, Turku, Finland.
Faculty of Medicine, University of Turku, Turku, Finland.
Acta Oncol. 2020 Jun;59(6):689-695. doi: 10.1080/0284186X.2020.1736332. Epub 2020 Mar 9.
The current standard for evaluating axillary nodal burden in clinically node-negative breast cancer is sentinel lymph node biopsy (SLNB). However, the accuracy of SLNB in detecting nodal stage N2-3 remains debatable. Nomograms can support the choice between axillary treatment options. The aim of this study was to create a new model to predict nodal stage N2-3 after a positive SLNB using machine learning methods, which are rarely used in nomogram development. Primary breast cancer patients who underwent SLNB and axillary lymph node dissection (ALND) between 2012 and 2017 formed the cohorts for nomogram development (training cohort, n = 460) and for nomogram validation (validation cohort, n = 70). A gradient boosted trees model (XGBoost) was used to determine the variables associated with nodal stage N2-3 and to create a predictive model. Multivariate logistic regression analysis was used for comparison. The best combination of variables associated with nodal stage N2-3 in XGBoost modeling included tumor size, histological type, multifocality, lymphovascular invasion, percentage of ER-positive cells, number of positive sentinel lymph nodes (SLNs), and number of positive SLNs multiplied by tumor size. As a measure of discrimination, AUC values for the training cohort and the validation cohort were 0.80 (95% CI 0.71-0.89) and 0.80 (95% CI 0.65-0.92), respectively, in the XGBoost model and 0.85 (95% CI 0.77-0.93) and 0.75 (95% CI 0.58-0.89), respectively, in the logistic regression model. The machine learning model thus maintained its discrimination in the validation cohort better than the logistic regression model did: its AUC was unchanged between cohorts, whereas the AUC of the logistic regression model fell from 0.85 to 0.75. This indicates advantages in incorporating modern artificial intelligence techniques into nomogram development. The nomogram could be used to help identify nodal stage N2-3 in early breast cancer and to select appropriate treatments for patients.
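For readers unfamiliar with how an AUC and its 95% CI are obtained, the following is a minimal sketch in plain Python. It is not the authors' code: the abstract does not state how the confidence intervals were computed, so the percentile bootstrap used here is an assumption, and the function names (`auc`, `bootstrap_ci`) and the four-patient toy data are hypothetical and purely illustrative.

```python
import random

def auc(labels, scores):
    """AUC as the probability that a random positive case scores higher
    than a random negative case (ties count as 0.5)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("both classes must be present")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

def bootstrap_ci(labels, scores, n_boot=1000, alpha=0.05, seed=0):
    """Approximate percentile-bootstrap CI for the AUC (assumed method,
    not confirmed by the abstract)."""
    rng = random.Random(seed)
    n = len(labels)
    stats = []
    while len(stats) < n_boot:
        idx = [rng.randrange(n) for _ in range(n)]
        ys = [labels[i] for i in idx]
        if 0 < sum(ys) < n:  # keep only resamples containing both classes
            stats.append(auc(ys, [scores[i] for i in idx]))
    stats.sort()
    lower = stats[int((alpha / 2) * n_boot)]
    upper = stats[int((1 - alpha / 2) * n_boot) - 1]
    return lower, upper

# Hypothetical toy data: 1 = nodal stage N2-3, 0 = otherwise;
# scores stand in for predicted probabilities from any model.
labels = [0, 0, 1, 1]
scores = [0.1, 0.4, 0.35, 0.8]
print(auc(labels, scores))  # 0.75
```

Applied separately to the training and validation cohorts, a procedure of this kind yields figures of the form reported above, for example 0.80 (95% CI 0.65-0.92). A small validation cohort such as n = 70 naturally produces a wider interval.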