Mohebian Mohammad R, Marateb Hamid R, Mansourian Marjan, Mañanas Miguel Angel, Mokarian Fariborz
Biomedical Engineering Department, Engineering Faculty, University of Isfahan, Hezar Jerib St., 81746-73441, Isfahan, Iran.
Biomedical Engineering Department, Engineering Faculty, University of Isfahan, Hezar Jerib St., 81746-73441, Isfahan, Iran; Department of Automatic Control, Biomedical Engineering Research Center, Universitat Politècnica de Catalunya, BarcelonaTech (UPC), C. Pau Gargallo, 5, 08028 Barcelona, Spain.
Comput Struct Biotechnol J. 2016 Dec 6;15:75-85. doi: 10.1016/j.csbj.2016.11.004. eCollection 2017.
Cancer is a collection of diseases involving abnormal cell growth with the potential to invade or spread to other parts of the body. Breast cancer is the second leading cause of cancer death among women. A method for 5-year breast cancer recurrence prediction is presented in this manuscript. Clinicopathologic characteristics of 579 breast cancer patients (recurrence prevalence of 19.3%) were analyzed, and discriminative features were selected using statistical feature selection methods. These features were further refined by Particle Swarm Optimization (PSO) and used as the inputs of an ensemble-learning classification system (Bagged Decision Tree: BDT). The PSO algorithm identified the proper combination of the selected categorical features and the weight (importance) of the selected interval-measurement-scale features. The performance of HPBCR (hybrid predictor of breast cancer recurrence) was assessed using holdout and 4-fold cross-validation. Three other classifiers, namely support vector machines, decision tree (DT), and multilayer perceptron neural network, were used for comparison. The selected features were diagnosis age, tumor size, lymph node involvement ratio, number of involved axillary lymph nodes, progesterone receptor expression, having hormone therapy, and type of surgery. The minimum sensitivity, specificity, precision and accuracy of HPBCR were 77%, 93%, 95% and 85%, respectively, across all cross-validation folds and the hold-out test fold. HPBCR outperformed the other tested classifiers. It showed excellent agreement with the gold standard (i.e., the oncologist's opinion after blood tumor marker and imaging tests, and tissue biopsy). This algorithm is thus a promising online tool for the prediction of breast cancer recurrence.
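The abstract reports the minimum sensitivity, specificity, precision and accuracy over the cross-validation folds and the hold-out fold. As a minimal sketch of how such per-fold figures and their minimum could be computed, the Python below uses the standard confusion-matrix definitions. The function names `fold_metrics` and `minimum_over_folds` and the toy labels are illustrative assumptions; they are not the authors' code or data.

```python
from typing import Dict, List, Sequence


def fold_metrics(y_true: Sequence[int], y_pred: Sequence[int]) -> Dict[str, float]:
    """Return the four metrics for one fold (1 = recurrence, 0 = no recurrence)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # true positives
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # true negatives
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # false positives
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # false negatives

    def ratio(num: int, den: int) -> float:
        # Undefined ratios (empty denominator) are reported as NaN.
        return num / den if den else float("nan")

    return {
        "sensitivity": ratio(tp, tp + fn),
        "specificity": ratio(tn, tn + fp),
        "precision": ratio(tp, tp + fp),
        "accuracy": ratio(tp + tn, len(pairs)),
    }


def minimum_over_folds(per_fold: List[Dict[str, float]]) -> Dict[str, float]:
    """Take the worst (minimum) value of each metric across all folds."""
    return {key: min(m[key] for m in per_fold) for key in per_fold[0]}


# Toy labels for two folds (hypothetical, for illustration only).
folds = [
    ([1, 1, 1, 1, 0, 0, 0, 0, 0, 0], [1, 1, 1, 0, 0, 0, 0, 0, 0, 1]),
    ([1, 1, 0, 0, 0, 0, 0, 0], [1, 1, 0, 0, 0, 0, 0, 0]),
]
per_fold = [fold_metrics(y_true, y_pred) for y_true, y_pred in folds]
worst_case = minimum_over_folds(per_fold)
```

Reporting the minimum rather than the mean gives a conservative, worst-fold summary, which is how the figures above are stated.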