Czarnota Jenna, Gennings Chris, Wheeler David C
Department of Biostatistics, School of Medicine, Virginia Commonwealth University, Richmond, VA, USA.
Department of Preventive Medicine, Icahn School of Medicine at Mount Sinai, New York, NY, USA.
Cancer Inform. 2015 May 13;14(Suppl 2):159-71. doi: 10.4137/CIN.S17295. eCollection 2015.
In evaluating cancer risk related to environmental chemical exposures, the effect of many chemicals on disease is ultimately of interest. However, because of potentially strong correlations among chemicals that occur together, traditional regression methods suffer from collinearity effects, including regression coefficient sign reversal and variance inflation. In addition, penalized regression methods designed to remediate collinearity may have limitations in selecting the truly bad actors among many correlated components. The recently proposed method of weighted quantile sum (WQS) regression attempts to overcome these problems by estimating a body burden index, which identifies important chemicals in a mixture of correlated environmental chemicals. Our focus was on assessing through simulation studies the accuracy of WQS regression in detecting subsets of chemicals associated with health outcomes (binary and continuous) in site-specific analyses and in non-site-specific analyses. We also evaluated the performance of the penalized regression methods of lasso, adaptive lasso, and elastic net in correctly classifying chemicals as bad actors or unrelated to the outcome. We based the simulation study on data from the National Cancer Institute Surveillance, Epidemiology, and End Results Program (NCI-SEER) case-control study of non-Hodgkin lymphoma (NHL) to achieve realistic exposure situations. Our results showed that WQS regression had good sensitivity and specificity across a variety of conditions considered in this study. The shrinkage methods had a tendency to incorrectly identify a large number of components, especially in the case of strong association with the outcome.
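As a rough illustration of the body burden index mentioned above, the following Python sketch scores each chemical into quantile bins and combines the scores with a set of weights. It is a minimal sketch under stated assumptions, not the authors' implementation: the function names `quantile_scores` and `wqs_index`, the choice of quartiles, and the constraint that weights are nonnegative and sum to one are assumptions made for this example, and the weights are supplied by hand rather than estimated from data as a full WQS analysis would do.

```python
from bisect import bisect_right
from statistics import quantiles


def quantile_scores(values, n_bins=4):
    """Score each value 0..n_bins-1 according to its quantile bin.

    Hypothetical helper: cut points come from statistics.quantiles,
    and a value equal to a cut point is placed in the higher bin.
    """
    cuts = quantiles(values, n=n_bins)  # n_bins - 1 cut points
    return [bisect_right(cuts, v) for v in values]


def wqs_index(exposures, weights, n_bins=4):
    """Weighted sum of quantile scores for each subject.

    exposures: dict mapping chemical name -> list of concentrations,
               with subjects in the same order for every chemical.
    weights:   dict mapping chemical name -> weight. Assumed here to be
               nonnegative and to sum to one; a real analysis would
               estimate them rather than take them as given.
    """
    if any(w < 0 for w in weights.values()):
        raise ValueError("weights must be nonnegative")
    if abs(sum(weights.values()) - 1.0) > 1e-9:
        raise ValueError("weights must sum to one")

    scores = {c: quantile_scores(x, n_bins) for c, x in exposures.items()}
    n_subjects = len(next(iter(exposures.values())))
    return [
        sum(weights[c] * scores[c][i] for c in exposures)
        for i in range(n_subjects)
    ]


# Toy data (made up for illustration): two chemicals, eight subjects.
exposures = {
    "chem_a": [1, 2, 3, 4, 5, 6, 7, 8],
    "chem_b": [8, 7, 6, 5, 4, 3, 2, 1],
}
weights = {"chem_a": 0.75, "chem_b": 0.25}
index = wqs_index(exposures, weights)
```

In this toy setup a chemical with a larger weight contributes more to the index, which is the sense in which the weights can flag candidate bad actors. The resulting index would then enter a regression model for the binary or continuous outcome; that step is omitted here.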