Data mining methods in the prediction of Dementia: A real-data comparison of the accuracy, sensitivity and specificity of linear discriminant analysis, logistic regression, neural networks, support vector machines, classification trees and random forests.

Author information

Maroco João, Silva Dina, Rodrigues Ana, Guerreiro Manuela, Santana Isabel, de Mendonça Alexandre

Affiliation

Unidade de Investigação em Psicologia e Saúde & Departamento de Estatística, ISPA - Instituto Universitário, Rua Jardim do Tabaco 44, 1149-041 Lisboa, Portugal.

Publication information

BMC Res Notes. 2011 Aug 17;4:299. doi: 10.1186/1756-0500-4-299.

Abstract

BACKGROUND

Dementia and cognitive impairment associated with aging are a major medical and social concern. Neuropsychological testing is a key element in the diagnostic procedures of Mild Cognitive Impairment (MCI), but currently has limited value in predicting progression to dementia. We advance the hypothesis that newer statistical classification methods derived from data mining and machine learning, such as Neural Networks, Support Vector Machines and Random Forests, can improve the accuracy, sensitivity and specificity of predictions obtained from neuropsychological testing. Seven nonparametric classifiers derived from data mining methods (Multilayer Perceptron Neural Networks, Radial Basis Function Neural Networks, Support Vector Machines, CART, CHAID and QUEST Classification Trees, and Random Forests) were compared to three traditional classifiers (Linear Discriminant Analysis, Quadratic Discriminant Analysis and Logistic Regression) in terms of overall classification accuracy, specificity, sensitivity, area under the ROC curve and Press' Q. Model predictors were 10 neuropsychological tests currently used in the diagnosis of dementia. Statistical distributions of classification parameters obtained from a 5-fold cross-validation were compared using Friedman's nonparametric test.
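The comparison step described above, Friedman's test applied to per-fold classification parameters, can be sketched as follows. This is a minimal illustration and not the authors' code: it assumes folds act as blocks and classifiers as treatments, uses the textbook statistic without a tie correction, and the function names, classifier labels and fold accuracies are all hypothetical.

```python
from typing import Dict, List


def average_ranks(values: List[float]) -> List[float]:
    """Rank values from 1 (smallest) upward; tied values share their mean rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j across the run of values tied with values[order[i]].
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1  # sorted positions i..j are 0-based
        for pos in range(i, j + 1):
            ranks[order[pos]] = mean_rank
        i = j + 1
    return ranks


def friedman_statistic(scores: Dict[str, List[float]]) -> float:
    """Friedman chi-square statistic, without tie correction.

    `scores` maps a classifier name to its per-fold metric values.
    With n folds and k classifiers the statistic is
    12 / (n k (k + 1)) * sum(R_j^2) - 3 n (k + 1),
    where R_j is the sum over folds of classifier j's within-fold rank.
    """
    names = list(scores)
    k = len(names)
    n = len(scores[names[0]])
    if any(len(scores[name]) != n for name in names):
        raise ValueError("every classifier needs one score per fold")
    rank_sums = [0.0] * k
    for fold in range(n):
        fold_ranks = average_ranks([scores[name][fold] for name in names])
        for j, rank in enumerate(fold_ranks):
            rank_sums[j] += rank
    return 12.0 / (n * k * (k + 1)) * sum(r * r for r in rank_sums) - 3.0 * n * (k + 1)


# Made-up fold accuracies for illustration only; these are not the study's data.
fold_accuracy = {
    "classifier_a": [0.70, 0.72, 0.68, 0.75, 0.71],
    "classifier_b": [0.65, 0.66, 0.64, 0.70, 0.69],
    "classifier_c": [0.60, 0.61, 0.63, 0.62, 0.58],
}
print(friedman_statistic(fold_accuracy))
```

The statistic is compared against a chi-square distribution with k - 1 degrees of freedom. With only five folds that approximation is rough, so a library routine or exact tables would be a safer choice for a real analysis.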

RESULTS

Press' Q test showed that all classifiers performed better than chance alone (p < 0.05). Support Vector Machines showed the largest overall classification accuracy (median (Me) = 0.76) and area under the ROC curve (Me = 0.90). However, this method showed high specificity (Me = 1.0) but low sensitivity (Me = 0.3). Random Forests ranked second in overall accuracy (Me = 0.73), with high area under the ROC curve (Me = 0.73), specificity (Me = 0.73) and sensitivity (Me = 0.64). Linear Discriminant Analysis also showed acceptable overall accuracy (Me = 0.66), with acceptable area under the ROC curve (Me = 0.72), specificity (Me = 0.66) and sensitivity (Me = 0.64). The remaining classifiers showed overall classification accuracy above a median value of 0.63, but for most of them sensitivity was around or even below a median value of 0.5.
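The quantities reported above can be computed from predicted and true labels as in the sketch below. It is an illustration under stated assumptions rather than the authors' procedure: labels are assumed to be coded 1 for the positive class and 0 otherwise, Press' Q is written in its commonly cited form (N - nK)^2 / (N(K - 1)) and compared with the chi-square critical value for 1 degree of freedom, and the function names and example labels are hypothetical.

```python
from typing import Dict, Sequence

# Chi-square critical value for 1 degree of freedom at alpha = 0.05.
CHI2_CRIT_1DF_05 = 3.841


def binary_metrics(y_true: Sequence[int], y_pred: Sequence[int]) -> Dict[str, float]:
    """Accuracy, sensitivity and specificity for labels coded 1 (positive) / 0."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    nan = float("nan")  # returned when a rate has an empty denominator
    return {
        "accuracy": (tp + tn) / len(pairs) if pairs else nan,
        "sensitivity": tp / (tp + fn) if (tp + fn) else nan,
        "specificity": tn / (tn + fp) if (tn + fp) else nan,
    }


def press_q(n_total: int, n_correct: int, n_groups: int = 2) -> float:
    """Press' Q = (N - n*K)^2 / (N*(K - 1)).

    N is the sample size, n the number of correct classifications and
    K the number of groups.
    """
    return (n_total - n_correct * n_groups) ** 2 / (n_total * (n_groups - 1))


# Made-up labels for illustration only: 4 positive cases, 6 negative cases.
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
y_pred = [1, 0, 0, 0, 0, 0, 0, 0, 0, 0]

metrics = binary_metrics(y_true, y_pred)
n_correct = sum(1 for t, p in zip(y_true, y_pred) if t == p)
q = press_q(len(y_true), n_correct)
print(metrics, q, q > CHI2_CRIT_1DF_05)
```

In this toy example the classifier labels almost every case negative, so specificity is perfect while sensitivity is low. That is the same pattern reported above for Support Vector Machines, and it shows why overall accuracy alone can be misleading.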

CONCLUSIONS

When sensitivity, specificity and overall classification accuracy are taken into account, Random Forests and Linear Discriminant Analysis rank first among all the classifiers tested in the prediction of dementia from several neuropsychological tests. These methods may be used to improve the accuracy, sensitivity and specificity of dementia predictions from neuropsychological testing.


https://cdn.ncbi.nlm.nih.gov/pmc/blobs/b906/3180705/d232717bd7b6/1756-0500-4-299-1.jpg
