Division of Colorectal Surgery, Department of Surgery, Chung Shan Medical University Hospital, Taiwan.
Institute of Medicine, Chung Shan Medical University, Taiwan.
Int J Med Sci. 2020 Jan 15;17(3):280-291. doi: 10.7150/ijms.37134. eCollection 2020.
Colorectal cancer (CRC) is the third most commonly diagnosed cancer worldwide. Recurrence of CRC (Re) and onset of a second primary malignancy (SPM) are important indicators in treating CRC, but it is often difficult to predict the onset of an SPM. Therefore, we used machine learning to identify risk factors that affect Re and SPM.
CRC patients were identified from the cancer registry databases of three medical centers. All patients were classified by recurrence status, Re or no recurrence (NRe), and by second primary malignancy status, SPM or no SPM (NSPM). Two classifiers, A Library for Support Vector Machines (LIBSVM) and Reduced Error Pruning Tree (REPTree), were used to construct optimized models of the relationship between clinical features and the Re and/or SPM category.
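The four combined categories used when Re and SPM are evaluated together can be illustrated with a minimal Python sketch. The record layout, the field names `recurrence` and `second_primary`, and the helper `combined_label` are hypothetical and chosen only for illustration; the abstract does not describe how the registry data were encoded.

```python
from collections import Counter


def combined_label(recurrence: bool, second_primary: bool) -> str:
    """Map two binary outcomes to one of the four combined categories."""
    spm_part = "SPM" if second_primary else "NSPM"
    re_part = "Re" if recurrence else "NRe"
    return f"{spm_part}+{re_part}"


# Hypothetical toy records; these are not data from the study.
patients = [
    {"id": 1, "recurrence": True, "second_primary": True},
    {"id": 2, "recurrence": True, "second_primary": False},
    {"id": 3, "recurrence": False, "second_primary": True},
    {"id": 4, "recurrence": False, "second_primary": False},
    {"id": 5, "recurrence": False, "second_primary": False},
]

# One class label per patient, as a four-class target for a classifier.
labels = [combined_label(p["recurrence"], p["second_primary"]) for p in patients]

# Class counts show how balanced (or not) the four categories are.
label_counts = Counter(labels)
print(label_counts)
```

Labels built this way could serve as the target of any multi-class classifier; the clinical features and the classifiers themselves are outside the scope of this sketch.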
When Re and SPM were evaluated separately, the accuracy of LIBSVM was 0.878 and that of REPTree was 0.622. When Re and SPM were evaluated in combination, the precision of models for SPM+Re, NSPM+Re, SPM+NRe, and NSPM+NRe was 0.878, 0.662, 0.774, and 0.778, respectively.
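For readers unfamiliar with the two metrics reported above, the following Python sketch shows how overall accuracy and per-class precision are conventionally computed from true and predicted labels. The function names and the toy label lists are assumptions made for illustration; they do not reproduce the study's data or its evaluation code.

```python
def accuracy(y_true, y_pred):
    """Fraction of samples whose predicted label equals the true label."""
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    return correct / len(y_true)


def per_class_precision(y_true, y_pred):
    """For each class: correct predictions of that class / all predictions of it.

    Returns None for a class that was never predicted (precision undefined).
    """
    classes = sorted(set(y_true) | set(y_pred))
    result = {}
    for cls in classes:
        # True labels of every sample that the model assigned to this class.
        assigned = [t for t, p in zip(y_true, y_pred) if p == cls]
        result[cls] = (sum(t == cls for t in assigned) / len(assigned)) if assigned else None
    return result


# Hypothetical toy labels; these are not results from the study.
y_true = ["SPM+Re", "NSPM+Re", "SPM+NRe", "NSPM+NRe", "NSPM+NRe", "NSPM+Re"]
y_pred = ["SPM+Re", "NSPM+Re", "NSPM+NRe", "NSPM+NRe", "NSPM+NRe", "NSPM+NRe"]

overall_accuracy = accuracy(y_true, y_pred)
precision_by_class = per_class_precision(y_true, y_pred)
print(overall_accuracy, precision_by_class)
```

Precision is reported per class because, with four categories of unequal size, a single accuracy figure can hide poor performance on the rarer categories.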
Machine learning can be used to rank factors affecting tumor Re and SPM. In clinical practice, routine checkups are necessary to ensure early detection of new tumors. The success of prediction and early detection may be enhanced in the future by applying "big data" analysis methods such as machine learning.