Liu Chao, Zhao Zeyin, Gu Xi, Sun Lisha, Chen Guanglei, Zhang Hao, Jiang Yanlin, Zhang Yixiao, Cui Xiaoyu, Liu Caigang
Department of Breast Surgery, Shengjing Hospital of China Medical University, Shenyang, China.
Sino-Dutch Biomedical and Information Engineering School, Northeastern University, Shenyang, China.
Front Oncol. 2019 Apr 16;9:282. doi: 10.3389/fonc.2019.00282. eCollection 2019.
Lymph node metastasis is a multifactorial event. Several groups have developed nomogram models to predict sentinel lymph node (SLN) metastasis before surgery. Based on the clinical and pathological characteristics of breast cancer patients, we used a new method to establish a more comprehensive model, added factors that had not previously been analyzed, and explored the prospects for its clinical application. The clinicopathological data of 633 patients with breast cancer who underwent SLN examination from January 2011 to December 2014 were retrospectively analyzed. Because the data were imbalanced, we used the SMOTE algorithm to oversample the data and balance the classes. Our study included, for the first time, tumor shape and breast glandular content. Tumor location was analyzed with a combined vector-and-quadrant method, and the results were compared with those obtained using the quadrant or the vector alone. We also compared the predictive ability of models built with logistic regression and with the Bagged-Tree algorithm. The Bagged-Tree algorithm was used to classify samples. The SMOTE-Bagged Tree algorithm and 5-fold cross-validation were used to establish the prediction model. The clinical application value of the model in early breast cancer patients was evaluated using the confusion matrix and the area under the receiver operating characteristic (ROC) curve (AUC).
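The SMOTE oversampling step mentioned above can be sketched in plain Python. This is a minimal illustration under stated assumptions, not the authors' implementation: the function names `smote` and `balance`, the neighbour count `k=5`, and the label convention (1 for the minority class) are all hypothetical, and every feature is treated as continuous, which would not suit categorical variables such as quadrant without further encoding.

```python
import math
import random


def smote(minority, n_new, k=5, seed=0):
    """Create n_new synthetic samples by interpolating between minority points.

    Each synthetic point lies on the segment between a randomly chosen
    minority sample and one of its k nearest minority neighbours.
    """
    if len(minority) < 2:
        raise ValueError("SMOTE needs at least two minority samples")
    rng = random.Random(seed)
    k = min(k, len(minority) - 1)
    synthetic = []
    for _ in range(n_new):
        i = rng.randrange(len(minority))
        base = minority[i]
        # Indices of the k nearest minority neighbours, excluding the base itself.
        neighbours = sorted(
            (j for j in range(len(minority)) if j != i),
            key=lambda j: math.dist(base, minority[j]),
        )[:k]
        other = minority[rng.choice(neighbours)]
        gap = rng.random()  # position along the segment from base to other
        synthetic.append([b + gap * (o - b) for b, o in zip(base, other)])
    return synthetic


def balance(X, y, minority_label=1, k=5, seed=0):
    """Oversample the minority class until both classes have equal counts."""
    minority = [x for x, label in zip(X, y) if label == minority_label]
    n_majority = len(X) - len(minority)
    n_new = max(0, n_majority - len(minority))
    new = smote(minority, n_new, k=k, seed=seed) if n_new else []
    return X + new, y + [minority_label] * len(new)
```

Calling `balance(X, y)` on a list of feature vectors and binary labels returns the original samples followed by the synthetic minority samples. Because each new point is a convex combination of two real minority samples, it always stays inside the range spanned by the observed minority data.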
Our predictive model included 12 variables: age, body mass index (BMI), quadrant, clock direction, distance of the tumor from the nipple, tumor morphology on molybdenum-target mammography, glandular content, tumor size, estrogen receptor (ER), progesterone receptor (PR), human epidermal growth factor receptor 2 (HER2), and Ki-67. The final model achieved an AUC of 0.801 and an accuracy of 70.3%. When logistic regression was used to establish the model, the AUCs in the modeling and validation groups were 0.660 and 0.580, respectively. Analyzing the original tumor location with the combined vector-and-quadrant method was more precise than using the vector or the quadrant alone (AUC 0.801 vs. 0.791 vs. 0.701; accuracy 70.3 vs. 70.3 vs. 63.6%). Our model is more reliable and stable in helping doctors predict SLN metastasis in breast cancer patients before surgery.
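The accuracy and AUC figures above are the kind of metrics obtained from a confusion matrix and ROC analysis. The following is a rough sketch of how they can be computed, with hypothetical helper names and the assumption that label 1 denotes SLN-positive cases. The AUC uses its rank interpretation: the probability that a randomly chosen positive case receives a higher score than a randomly chosen negative case, with ties counted as one half.

```python
def confusion_matrix(y_true, y_pred):
    """Return (tp, fp, tn, fn) for binary labels, where 1 is the positive class."""
    tp = fp = tn = fn = 0
    for t, p in zip(y_true, y_pred):
        if t == 1 and p == 1:
            tp += 1
        elif t == 0 and p == 1:
            fp += 1
        elif t == 0 and p == 0:
            tn += 1
        else:
            fn += 1
    return tp, fp, tn, fn


def accuracy(y_true, y_pred):
    """Fraction of samples classified correctly."""
    tp, fp, tn, fn = confusion_matrix(y_true, y_pred)
    return (tp + tn) / (tp + fp + tn + fn)


def roc_auc(y_true, scores):
    """Area under the ROC curve via pairwise comparison of positives and negatives."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative sample")
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))
```

The pairwise form is quadratic in the number of samples, which is acceptable for a cohort of a few hundred patients. Accuracy depends on the chosen classification threshold, whereas AUC summarizes ranking quality across all thresholds, which is why the two can diverge, as in the vector-only comparison reported above.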