Data-driven diagnosis of spinal abnormalities using feature selection and machine learning algorithms.

Affiliation

Institute of Information and Communication Technology, Bangladesh University of Engineering and Technology, Dhaka, Bangladesh.

Publication information

PLoS One. 2020 Feb 6;15(2):e0228422. doi: 10.1371/journal.pone.0228422. eCollection 2020.

Abstract

This paper applies machine learning algorithms to the prediction of spinal abnormalities. For data preprocessing, univariate feature selection is considered as a filter-based feature selection method, and principal component analysis (PCA) as a feature extraction algorithm. Several machine learning approaches, namely support vector machine (SVM), logistic regression (LR), and bagging ensemble methods, are considered for diagnosing spinal abnormality. The SVM, LR, bagging SVM, and bagging LR models are applied to a dataset of 310 samples that is publicly available on Kaggle. The classification of abnormal and normal spinal patients is evaluated on several measures, including training and testing accuracy, recall, and miss rate. The classifier models are also evaluated by optimizing the kernel parameters and by using receiver operating characteristic (ROC) and precision-recall curves. Results indicate that when 78% of the data are used for training, the observed training accuracies for SVM, LR, bagging SVM, and bagging LR are 86.30%, 85.47%, 86.72%, and 85.06%, respectively. On the test dataset, all four models reach the same accuracy of 86.96%. However, bagging SVM is the most attractive because it has a higher recall and a lower miss rate than the others. Hence, bagging SVM is suitable for classifying spinal patients when applied to the five most important features of the spinal samples.
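The abstract's closing argument, that models with identical test accuracy can still differ in recall and miss rate, can be illustrated with a short sketch. The confusion counts below are hypothetical and are not taken from the paper. They assume a test set of 69 samples (60 correct predictions out of 69 rounds to 86.96%) and an invented split of 47 abnormal and 22 normal cases. The class `ConfusionCounts` and the names `model_a` and `model_b` are illustrative only, and "abnormal" is treated as the positive class.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ConfusionCounts:
    """Binary confusion counts with 'abnormal' as the positive class."""
    tp: int  # abnormal patients predicted abnormal
    fn: int  # abnormal patients predicted normal (missed cases)
    tn: int  # normal patients predicted normal
    fp: int  # normal patients predicted abnormal

    @property
    def total(self) -> int:
        return self.tp + self.fn + self.tn + self.fp

    def accuracy(self) -> float:
        # Fraction of all samples classified correctly.
        return (self.tp + self.tn) / self.total

    def recall(self) -> float:
        # Fraction of truly abnormal patients that the model detects.
        return self.tp / (self.tp + self.fn)

    def miss_rate(self) -> float:
        # False negative rate, equal to 1 - recall.
        return self.fn / (self.tp + self.fn)


# Hypothetical counts (NOT from the paper): both models classify 60 of 69
# test samples correctly, but they distribute their errors differently.
model_a = ConfusionCounts(tp=44, fn=3, tn=16, fp=6)
model_b = ConfusionCounts(tp=42, fn=5, tn=18, fp=4)

for name, counts in (("model_a", model_a), ("model_b", model_b)):
    print(
        f"{name}: accuracy={counts.accuracy():.2%} "
        f"recall={counts.recall():.2%} miss_rate={counts.miss_rate():.2%}"
    )
```

In this sketch both models have the same accuracy, but `model_a` misses fewer abnormal patients. In a diagnostic setting that difference can matter more than overall accuracy, which is the kind of reasoning the abstract uses to prefer bagging SVM.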

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/a120/7004343/fba188b0f8a7/pone.0228422.g001.jpg
