Ahn Ji Hyun, Kwak Min Seob, Lee Hun Hee, Cha Jae Myung, Shin Hyun Phil, Jeon Jung Won, Yoon Jin Young
Department of Internal Medicine, Kyung Hee University Hospital at Gangdong, Kyung Hee University College of Medicine, Seoul, South Korea.
Front Oncol. 2021 Mar 25;11:614398. doi: 10.3389/fonc.2021.614398. eCollection 2021.
A simplified model for predicting lymph node metastasis (LNM) in patients with early colorectal cancer (CRC) is urgently needed to guide treatment and follow-up strategies. In this study, we therefore aimed to develop an accurate predictive model for LNM in early CRC.
We analyzed data from the 2004-2016 Surveillance, Epidemiology, and End Results (SEER) database to develop and validate prediction models for LNM. Seven models were used: logistic regression, XGBoost, k-nearest neighbors, classification and regression trees, support vector machines, neural network, and random forest (RF).
A total of 26,733 patients with a diagnosis of early CRC (T1) were analyzed. The models included 8 independent prognostic variables: age at diagnosis, sex, race, primary site, histologic type, tumor grade, and tumor size. LNM was significantly more frequent in patients with larger tumors, women, younger patients, and patients with more poorly differentiated tumors. The RF model showed the best predictive performance of all the methods compared, achieving an accuracy of 96.0%, a sensitivity of 99.7%, a specificity of 92.9%, and an area under the curve of 0.991. Tumor size was the most important feature in predicting LNM in early CRC.
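For readers less familiar with the four reported metrics, the following minimal sketch shows how accuracy, sensitivity, specificity, and area under the curve are conventionally computed for a binary classifier. It is an illustration only, not the authors' code: the function names (`confusion_counts`, `summary_metrics`, `auc`), the 0.5 decision threshold, and the toy labels and scores are all assumptions and do not come from the SEER data.

```python
from typing import Dict, Sequence, Tuple


def confusion_counts(y_true: Sequence[int], y_pred: Sequence[int]) -> Tuple[int, int, int, int]:
    """Return (tp, fp, tn, fn), treating label 1 as LNM-positive."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    return tp, fp, tn, fn


def summary_metrics(y_true: Sequence[int], y_pred: Sequence[int]) -> Dict[str, float]:
    """Accuracy, sensitivity (true positive rate), specificity (true negative rate)."""
    tp, fp, tn, fn = confusion_counts(y_true, y_pred)
    return {
        "accuracy": (tp + tn) / (tp + fp + tn + fn),
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
    }


def auc(y_true: Sequence[int], scores: Sequence[float]) -> float:
    """Area under the ROC curve as the probability that a randomly chosen
    positive case scores higher than a randomly chosen negative case
    (ties count as half)."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Hypothetical toy data: 1 = LNM present, 0 = LNM absent.
y_true = [1, 1, 1, 0, 0, 0, 0, 1]
scores = [0.9, 0.8, 0.4, 0.3, 0.6, 0.1, 0.2, 0.7]
y_pred = [1 if s >= 0.5 else 0 for s in scores]  # assumed 0.5 threshold

metrics = summary_metrics(y_true, y_pred)
auc_value = auc(y_true, scores)
print(metrics, auc_value)
```

Sensitivity and specificity depend on the chosen threshold, whereas the area under the curve summarizes ranking quality across all thresholds, which is why the two kinds of figure are usually reported together.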
We established a simplified, reproducible predictive model for LNM in early CRC that could be used to guide treatment decisions. These findings warrant further confirmation in large prospective clinical trials.