Faculty of Health, Sports and Social Work, Inholland University of Applied Sciences, Amsterdam, the Netherlands.
Tranzo, Tilburg University, Tilburg, the Netherlands.
Clin Interv Aging. 2023 Nov 14;18:1873-1882. doi: 10.2147/CIA.S428036. eCollection 2023.
Advanced statistical modeling techniques may help predict health outcomes. However, these techniques do not always outperform traditional techniques such as regression. In this study, five modeling strategies for predicting disability in community-dwelling older people in the Netherlands were externally validated.
We analyzed data from five studies of community-dwelling older people in the Netherlands. To predict the total disability score as measured with the Groningen Activity Restriction Scale (GARS), we used fourteen predictors measured with the Tilburg Frailty Indicator (TFI). Both the TFI and the GARS are self-report questionnaires. Five statistical modeling techniques were evaluated: general linear model (GLM), support vector machine (SVM), neural network (NN), recursive partitioning (RP), and random forest (RF). Each model was developed on one of the five data sets and then applied to each of the four remaining data sets. We assessed model performance with calibration characteristics, the correlation coefficient, and the root mean squared error (RMSE).
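The develop-on-one, validate-on-the-other-four design described above can be sketched as follows. This is a minimal illustration, not the authors' code: it uses only an ordinary least squares linear model, the data are synthetic stand-ins (fourteen binary predictors and an arbitrary continuous outcome), and the data set names `A` to `E` and all function names are hypothetical. Calibration is summarized here as the intercept and slope of observed scores regressed on predicted scores, which is one common choice and an assumption about what "calibration characteristics" covers.

```python
import numpy as np


def fit_linear(X, y):
    """Fit ordinary least squares with an intercept; return the coefficients."""
    design = np.column_stack([np.ones(len(X)), X])
    coef, *_ = np.linalg.lstsq(design, y, rcond=None)
    return coef


def predict_linear(coef, X):
    """Apply a fitted linear model to new predictor values."""
    return coef[0] + X @ coef[1:]


def performance(y_obs, y_pred):
    """Calibration intercept and slope, Pearson correlation, and RMSE."""
    # Regress observed on predicted: ideal calibration is slope 1, intercept 0.
    slope, intercept = np.polyfit(y_pred, y_obs, 1)
    r = np.corrcoef(y_obs, y_pred)[0, 1]
    rmse = np.sqrt(np.mean((y_obs - y_pred) ** 2))
    return {
        "intercept": float(intercept),
        "slope": float(slope),
        "r": float(r),
        "rmse": float(rmse),
    }


def cross_validate_datasets(datasets):
    """Develop a model on each data set and validate it on every other one."""
    results = {}
    for dev_name, (X_dev, y_dev) in datasets.items():
        coef = fit_linear(X_dev, y_dev)
        for val_name, (X_val, y_val) in datasets.items():
            if val_name == dev_name:
                continue  # external validation only
            y_pred = predict_linear(coef, X_val)
            results[(dev_name, val_name)] = performance(y_val, y_pred)
    return results


# Synthetic stand-in data (hypothetical; not the study data).
rng = np.random.default_rng(0)
true_coef = rng.uniform(0.5, 2.0, size=14)
datasets = {}
for name in ["A", "B", "C", "D", "E"]:
    X = rng.integers(0, 2, size=(300, 14)).astype(float)
    y = 20.0 + X @ true_coef + rng.normal(0.0, 2.0, size=300)
    datasets[name] = (X, y)

results = cross_validate_datasets(datasets)
for (dev, val), metrics in sorted(results.items()):
    print(dev, "->", val, {k: round(v, 2) for k, v in metrics.items()})
```

With five data sets this yields 20 development/validation pairs. Because the synthetic data sets are all drawn from the same distribution, the metrics are similar across pairs; a data set with deviating baseline characteristics would show up as a row or column of clearly worse values.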
The GLM, SVM, RP, and RF models showed satisfactory performance when validated on the validation data sets. One data set had baseline characteristics that deviated from those of the other data sets; all models performed poorly on this deviating data set, both when it was used for development and when it was used for validation.
The performance of four models (GLM, SVM, RP, RF) on the development data sets was satisfactory. This was also the case on the validation data sets, except when these models were developed on the deviating data set. The NN models performed much worse on the validation data sets than on the development data sets.