College of Management and Economics, Tianjin University, Tianjin 300072, China.
Business School, Nankai University, Tianjin 300071, China.
Comput Math Methods Med. 2021 Sep 10;2021:2213194. doi: 10.1155/2021/2213194. eCollection 2021.
Predicting the postoperative survival of lung cancer patients (LCPs) is an important problem in medical decision-making. However, the imbalanced distribution of patient survival in the dataset makes prediction more difficult. Although the synthetic minority oversampling technique (SMOTE) can be used to deal with imbalanced data, it cannot identify data noise. In addition, many studies combine a support vector machine (SVM) with resampling techniques to deal with imbalanced data, but most of them require the SVM parameters to be set manually, which makes it difficult to obtain the best performance. In this paper, a hybrid method combining an improved SMOTE with an adaptive SVM is proposed for imbalanced data to predict the postoperative survival of LCPs. The proposed method has two stages. In the first stage, the cross-validated committees filter (CVCF) is used to remove noisy samples and thereby improve the performance of SMOTE. In the second stage, we propose an adaptive SVM, which uses fuzzy self-tuning particle swarm optimization (FPSO) to optimize the SVM parameters. Compared with other advanced algorithms, our proposed method obtains the best performance, with 95.11% accuracy, 95.10% G-mean, 95.02% F1, and 95.10% area under the curve (AUC) for predicting the postoperative survival of LCPs.
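As a rough illustration of two ingredients named in the abstract, the following sketch shows plain SMOTE interpolation and the G-mean metric (the geometric mean of sensitivity and specificity, a standard measure for imbalanced data). It is not the authors' implementation: the CVCF noise filter and the FPSO-tuned SVM are not reproduced, and the function names `smote` and `g_mean`, the toy data, and the default `k=5` are assumptions made for the example.

```python
import math
import random


def smote(minority, n_new, k=5, seed=0):
    """Create n_new synthetic minority samples (hypothetical helper).

    Each synthetic point lies on the segment between a randomly chosen
    minority sample and one of its k nearest minority neighbours.
    Assumes at least two minority samples.
    """
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        # Nearest minority neighbours of the base sample, excluding itself.
        others = [p for p in minority if p is not base]
        neighbours = sorted(others, key=lambda p: math.dist(base, p))[:k]
        neighbour = rng.choice(neighbours)
        gap = rng.random()  # position along the segment base -> neighbour
        synthetic.append([b + gap * (n - b) for b, n in zip(base, neighbour)])
    return synthetic


def g_mean(y_true, y_pred, positive=1):
    """Geometric mean of sensitivity and specificity.

    Assumes both classes occur in y_true.
    """
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    tn = sum(t != positive and p != positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    return math.sqrt(sensitivity * specificity)


# Toy data, invented for illustration only.
minority = [[1.0, 1.0], [1.2, 0.9], [0.8, 1.1]]
new_points = smote(minority, n_new=4, k=2)

# One of two positives is missed and both negatives are correct:
# sensitivity = 0.5, specificity = 1.0.
score = g_mean([1, 1, 0, 0], [1, 0, 0, 0])
print(round(score, 4))  # 0.7071
```

Because every synthetic point is interpolated from existing minority samples, a noisy minority sample would spawn further noisy points, which is consistent with the abstract's motivation for filtering noise before oversampling.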