Makond Bunjira, Wang Kung-Jeng, Wang Kung-Min
Faculty of Commerce and Management, Prince of Songkla University, Trang, Thailand.
Department of Industrial Management, National Taiwan University of Science and Technology, Taipei 106, Taiwan, ROC.
Comput Methods Programs Biomed. 2015 May;119(3):142-62. doi: 10.1016/j.cmpb.2015.02.005. Epub 2015 Feb 21.
Predicting substantially short survival in patients is extremely risky. In this study, we propose a probabilistic model based on a Bayesian network (BN) to predict short survivability in patients with brain metastasis from lung cancer. We used a nationwide cancer patient database from Taiwan covering 1996 to 2010; the cohort consisted of 438 patients with brain metastasis from lung cancer. Because the class distribution was imbalanced, we applied the synthetic minority over-sampling technique (SMOTE). The proposed BN was compared with three competing models: naive Bayes (NB), logistic regression (LR), and support vector machine (SVM). Statistical analysis showed that, on the imbalanced data set, the performances of BN, LR, NB, and SVM did not differ significantly on any index, and all four models had low sensitivity. Results also showed that SMOTE improved the sensitivity of all four models while maintaining high accuracy and specificity. Further, the proposed BN compares favorably with NB, LR, and SVM in three respects: it is transparent and shows the relationships among the factors affecting brain metastasis from lung cancer; it allows decision makers to compute probabilities despite incomplete evidence and information; and its sensitivity was the highest among the models compared.
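The abstract names SMOTE and reports sensitivity, specificity, and accuracy, but does not give the over-sampling settings or the implementation used. The following is a minimal NumPy sketch of the general SMOTE idea (interpolating between a minority sample and one of its nearest minority neighbours) together with a helper for sensitivity and specificity. The function names `smote` and `sensitivity_specificity`, the neighbour count `k=5`, and the toy data are all assumptions for illustration; none of them come from the study.

```python
import numpy as np


def smote(X_min, n_new, k=5, seed=0):
    """Return n_new synthetic minority samples (illustrative sketch only).

    Each synthetic point lies on the line segment between a randomly chosen
    minority sample and one of its k nearest minority-class neighbours.
    """
    X_min = np.asarray(X_min, dtype=float)
    n = len(X_min)
    if n < 2:
        raise ValueError("need at least two minority samples to interpolate")
    k = min(k, n - 1)  # cannot have more neighbours than other samples
    rng = np.random.default_rng(seed)

    # Pairwise squared Euclidean distances among minority samples.
    d = ((X_min[:, None, :] - X_min[None, :, :]) ** 2).sum(axis=2)
    np.fill_diagonal(d, np.inf)  # a sample is not its own neighbour
    nn = np.argsort(d, axis=1)[:, :k]  # k nearest neighbours per sample

    base = rng.integers(0, n, size=n_new)  # which sample to start from
    neigh = nn[base, rng.integers(0, k, size=n_new)]  # which neighbour to use
    gap = rng.random((n_new, 1))  # interpolation factor in [0, 1)
    return X_min[base] + gap * (X_min[neigh] - X_min[base])


def sensitivity_specificity(y_true, y_pred):
    """Sensitivity (recall on class 1) and specificity (recall on class 0)."""
    y_true = np.asarray(y_true)
    y_pred = np.asarray(y_pred)
    tp = np.sum((y_true == 1) & (y_pred == 1))
    fn = np.sum((y_true == 1) & (y_pred == 0))
    tn = np.sum((y_true == 0) & (y_pred == 0))
    fp = np.sum((y_true == 0) & (y_pred == 1))
    return tp / (tp + fn), tn / (tn + fp)


# Toy, randomly generated data (NOT the study cohort): 90 majority vs 10 minority.
toy_rng = np.random.default_rng(1)
X_majority = toy_rng.normal(0.0, 1.0, size=(90, 2))
X_minority = toy_rng.normal(2.0, 1.0, size=(10, 2))

# Over-sample the minority class until both classes have 90 rows.
X_synthetic = smote(X_minority, n_new=80, k=5)
X_balanced = np.vstack([X_majority, X_minority, X_synthetic])
y_balanced = np.concatenate([np.zeros(90, dtype=int), np.ones(90, dtype=int)])
print(X_balanced.shape, y_balanced.mean())
```

In practice the over-sampling would be applied to the training split only, so that synthetic points never leak into the data used to estimate sensitivity and specificity.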