SVM and SVM Ensembles in Breast Cancer Prediction.

Author information

Huang Min-Wei, Chen Chih-Wen, Lin Wei-Chao, Ke Shih-Wen, Tsai Chih-Fong

Affiliations

Department of Psychiatry, Chiayi Branch, Taichung Veterans General Hospital, Chiayi, Taiwan.

Department of Pharmacy, Kaohsiung Municipal Chinese Medical Hospital, Kaohsiung, Taiwan.

Publication information

PLoS One. 2017 Jan 6;12(1):e0161501. doi: 10.1371/journal.pone.0161501. eCollection 2017.

Abstract

Breast cancer is an all too common disease in women, which makes its effective prediction an active research problem. A number of statistical and machine learning techniques have been used to develop breast cancer prediction models. Among them, support vector machines (SVM) have been shown to outperform many related techniques. Constructing an SVM classifier first requires choosing a kernel function, and different kernel functions can yield different prediction performance. However, very few studies have examined how the prediction performance of SVM varies with the kernel function. Moreover, it is unknown whether SVM classifier ensembles, which have been proposed to improve on the performance of single classifiers, outperform single SVM classifiers in breast cancer prediction. The aim of this paper is therefore to fully assess the prediction performance of SVM and SVM ensembles on small and large scale breast cancer datasets. The classification accuracy, ROC, F-measure, and training time of SVM and SVM ensembles are compared. The experimental results show that bagging-based SVM ensembles with a linear kernel and boosting-based SVM ensembles with an RBF kernel can be the better choices for a small scale dataset, where feature selection should be performed in the data pre-processing stage. For a large scale dataset, boosting-based SVM ensembles with an RBF kernel perform better than the other classifiers.
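
The comparison described in the abstract (single SVMs versus bagging and boosting ensembles, with linear and RBF kernels, scored on accuracy, ROC, F-measure, and training time) can be illustrated with a minimal sketch. Everything specific here is an assumption and not taken from the paper: scikit-learn as the toolkit, its bundled Wisconsin diagnostic dataset as a stand-in for the paper's data, a single stratified 70/30 split, 10 ensemble members, default SVM hyperparameters, and the helper names `build_models` and `evaluate`. It is not the authors' experimental protocol and is not expected to reproduce their numbers.

```python
import time

from sklearn.datasets import load_breast_cancer
from sklearn.ensemble import AdaBoostClassifier, BaggingClassifier
from sklearn.metrics import accuracy_score, f1_score, roc_auc_score
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC


def build_models(n_estimators=10, seed=0):
    """Hypothetical helper: one single SVM and two ensembles per kernel."""
    models = {}
    for kernel in ("linear", "rbf"):
        models[f"{kernel} single"] = SVC(kernel=kernel)
        models[f"{kernel} bagging"] = BaggingClassifier(
            SVC(kernel=kernel), n_estimators=n_estimators, random_state=seed
        )
        # probability=True exposes predict_proba, which some scikit-learn
        # versions need for their default AdaBoost variant.
        models[f"{kernel} boosting"] = AdaBoostClassifier(
            SVC(kernel=kernel, probability=True),
            n_estimators=n_estimators,
            random_state=seed,
        )
    return models


def evaluate(models, X_train, X_test, y_train, y_test):
    """Hypothetical helper: fit each model and collect the four measures."""
    results = {}
    for name, model in models.items():
        # Standardize features inside the pipeline so the scaler is fit
        # on the training split only.
        clf = make_pipeline(StandardScaler(), model)
        start = time.perf_counter()
        clf.fit(X_train, y_train)
        train_seconds = time.perf_counter() - start
        y_pred = clf.predict(X_test)
        scores = clf.decision_function(X_test)  # continuous scores for ROC
        results[name] = {
            "accuracy": accuracy_score(y_test, y_pred),
            "roc_auc": roc_auc_score(y_test, scores),
            "f_measure": f1_score(y_test, y_pred),  # positive class = label 1
            "train_seconds": train_seconds,
        }
    return results


# Assumed stand-in data: scikit-learn's bundled breast cancer dataset.
X, y = load_breast_cancer(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, stratify=y, random_state=0
)
results = evaluate(build_models(), X_train, X_test, y_train, y_test)
for name, m in results.items():
    print(
        f"{name:16s} acc={m['accuracy']:.3f} auc={m['roc_auc']:.3f} "
        f"f1={m['f_measure']:.3f} train={m['train_seconds']:.2f}s"
    )
```

A single split keeps the sketch short; a fairer comparison would likely use repeated cross-validation and tune `C` and `gamma` per kernel. Boosted SVMs are also sensitive to how sample weights interact with `C`, so the boosting rows in particular should be read as illustrative only.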

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/7943/5217832/4cee23cc05b0/pone.0161501.g001.jpg
