Characterization of Weighted Quantile Sum Regression for Highly Correlated Data in a Risk Analysis Setting.

Author information

Carrico Caroline, Gennings Chris, Wheeler David C, Factor-Litvak Pam

Affiliations

Department of Biostatistics, School of Medicine, Virginia Commonwealth University, Richmond, VA, USA.

Department of Epidemiology, Mailman School of Public Health, Columbia University, New York, NY, USA.

Publication information

J Agric Biol Environ Stat. 2015 Mar;20(1):100-120. doi: 10.1007/s13253-014-0180-3. Epub 2014 Dec 24.

Abstract

In risk evaluation, the effect of mixtures of environmental chemicals on a common adverse outcome is of interest. However, because of the high dimensionality and inherent correlations among chemicals that occur together, traditional methods (e.g., ordinary or logistic regression) suffer from collinearity and variance inflation, and shrinkage methods have limitations in selecting among correlated components. We propose a weighted quantile sum (WQS) approach to estimating a body burden index, which identifies "bad actors" in a set of highly correlated environmental chemicals. Through extensive simulation studies, we evaluate and characterize the accuracy of WQS regression in variable selection in terms of sensitivity and specificity (i.e., the ability of the WQS method to select the bad actors and to avoid selecting inactive components). We demonstrate the improvement in accuracy that this method provides over traditional ordinary regression and shrinkage methods (lasso, adaptive lasso, and elastic net). The simulation results show that WQS regression is accurate under some environmentally relevant conditions, but that, for a fixed correlation pattern, its accuracy decreases as the association with the response variable diminishes. Nonzero weights (i.e., weights exceeding a selection threshold parameter) may be used to identify bad actors; however, components within a cluster of highly correlated active components tend to have lower weights, with the sum of their weights representative of the set.
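
To make the idea of a weighted quantile sum index concrete, the following is a minimal Python sketch, not the authors' implementation. It assumes a continuous outcome, quartile scoring, a single least-squares fit with nonnegative weights that sum to one, and a nonnegative index coefficient. The published method may differ in these details (for example, in how weights are estimated and validated). The function names `quantile_scores` and `fit_wqs_weights`, the synthetic data, and the threshold `tau` are all hypothetical and chosen only for illustration.

```python
import numpy as np
from scipy.optimize import minimize


def quantile_scores(X, n_quantiles=4):
    """Score each column as 0..n_quantiles-1 using its empirical quantiles."""
    X = np.asarray(X, dtype=float)
    scores = np.empty(X.shape, dtype=int)
    # Interior cut points, e.g. the 25th, 50th and 75th percentiles for quartiles.
    probs = np.linspace(0, 1, n_quantiles + 1)[1:-1]
    for j in range(X.shape[1]):
        cuts = np.quantile(X[:, j], probs)
        scores[:, j] = np.searchsorted(cuts, X[:, j], side="right")
    return scores


def fit_wqs_weights(Q, y):
    """Fit y = b0 + b1 * (Q @ w) by least squares with w >= 0 and sum(w) = 1.

    Assumption for this sketch: b1 is constrained to be nonnegative and the
    weights come from one fit on the full data set.
    """
    Q = np.asarray(Q, dtype=float)
    y = np.asarray(y, dtype=float)
    c = Q.shape[1]

    def loss(theta):
        b0, b1, w = theta[0], theta[1], theta[2:]
        resid = y - (b0 + b1 * (Q @ w))
        return np.mean(resid ** 2)

    # Start from equal weights and a small positive slope.
    theta0 = np.concatenate(([y.mean(), 0.1], np.full(c, 1.0 / c)))
    bounds = [(None, None), (0.0, None)] + [(0.0, 1.0)] * c
    constraints = [{"type": "eq", "fun": lambda t: np.sum(t[2:]) - 1.0}]
    res = minimize(loss, theta0, method="SLSQP",
                   bounds=bounds, constraints=constraints)
    return res.x[0], res.x[1], res.x[2:]


# Synthetic illustration: 5 components, of which the first two are active.
rng = np.random.default_rng(0)
X = rng.normal(size=(300, 5))
Q = quantile_scores(X)
y = 1.0 + 0.8 * (0.6 * Q[:, 0] + 0.4 * Q[:, 1]) + rng.normal(scale=0.5, size=300)

b0, b1, w = fit_wqs_weights(Q, y)
tau = 1.0 / Q.shape[1]            # hypothetical selection threshold
selected = np.flatnonzero(w > tau)  # components flagged as "bad actors"
print("weights:", np.round(w, 3), "selected:", selected)
```

In this sketch, a component is flagged when its estimated weight exceeds `tau`. As the abstract notes, weight can be shared among highly correlated active components, so the summed weight of a cluster may be more informative than any single weight.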

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/46e2/6261506/64259a82610c/nihms-993354-f0001.jpg
