Variable selection and importance in presence of high collinearity: an application to the prediction of lean body mass from multi-frequency bioelectrical impedance.

Authors

Cammarota Camillo, Pinto Alessandro

Affiliations

Department of Mathematics, "Sapienza" University of Rome, Rome, Italy.

Department of Experimental Medicine, Research Unit on "Food Science and Human Nutrition", "Sapienza" University of Rome, Rome, Italy.

Publication information

J Appl Stat. 2020 May 13;48(9):1644-1658. doi: 10.1080/02664763.2020.1763930. eCollection 2021.

Abstract

In prediction problems both response and covariates may have high correlation with a second group of influential regressors, that can be considered as background variables. An important challenge is to perform variable selection and importance assessment among the covariates in the presence of these variables. A clinical example is the prediction of the lean body mass (response) from bioimpedance (covariates), where anthropometric measures play the role of background variables. We introduce a reduced dataset in which the variables are defined as the residuals with respect to the background, and perform variable selection and importance assessment both in linear and random forest models. Using a clinical dataset of multi-frequency bioimpedance, we show the effectiveness of this method to select the most relevant predictors of the lean body mass beyond anthropometry.
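The residual construction described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes an ordinary least-squares fit with an intercept as the way residuals "with respect to the background" are formed, and it uses synthetic data. The names `residualize`, `background`, `covariates` and `response` are hypothetical, and a simple correlation stands in for the linear and random forest importance measures used in the paper.

```python
import numpy as np

def residualize(X, B):
    """Residuals of X (vector or columns) after least-squares regression on B plus an intercept."""
    X = np.asarray(X, dtype=float)
    B = np.asarray(B, dtype=float)
    design = np.column_stack([np.ones(len(B)), B])
    coef, *_ = np.linalg.lstsq(design, X, rcond=None)
    return X - design @ coef

# Synthetic data (assumption): two background variables drive everything,
# and only the first covariate carries information beyond the background.
rng = np.random.default_rng(0)
n = 200
background = rng.normal(size=(n, 2))
noise = rng.normal(size=(n, 3))
covariates = background @ rng.normal(size=(2, 3)) * 3.0 + noise
response = (background @ np.array([2.0, -1.0])
            + 0.8 * noise[:, 0]
            + 0.1 * rng.normal(size=n))

# Reduced dataset: every variable is replaced by its residual w.r.t. the background.
cov_res = residualize(covariates, background)
resp_res = residualize(response, background)

# Correlation with the response before and after removing the background.
raw_corr = [np.corrcoef(covariates[:, j], response)[0, 1] for j in range(3)]
res_corr = [np.corrcoef(cov_res[:, j], resp_res)[0, 1] for j in range(3)]
print("raw:     ", np.round(raw_corr, 2))
print("residual:", np.round(res_corr, 2))
```

In this toy setting the raw correlations mostly reflect the shared background, while the residual correlations isolate the covariate that adds information beyond it. The same reduced dataset could then be passed to a linear or random forest model for selection and importance assessment.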

