Performance of variable selection methods for assessing the health effects of correlated exposures in case-control studies.

Affiliations

Division of Environmental Epidemiology, Institute for Risk Assessment Sciences, Utrecht University, Utrecht, The Netherlands.

Department of Epidemiology, Julius Center for Health Sciences and Primary Care, University Medical Center Utrecht, Utrecht, The Netherlands.

Publication information

Occup Environ Med. 2018 Jul;75(7):522-529. doi: 10.1136/oemed-2016-104231. Epub 2017 Sep 25.

DOI:10.1136/oemed-2016-104231

PMID:28947495

Abstract

OBJECTIVES

There is growing recognition that simultaneously assessing multiple exposures may reduce false positive discoveries and improve epidemiological effect estimates. We evaluated the performance of statistical methods for identifying exposure-outcome associations across various data structures typical of environmental and occupational epidemiology analyses.

METHODS

We simulated a case-control study, generating 100 data sets for each of 270 different simulation scenarios; varying the number of exposure variables, the correlation between exposures, sample size, the number of effective exposures and the magnitude of effect estimates. We compared conventional analytical approaches, that is, univariable (with and without multiplicity adjustment), multivariable and stepwise logistic regression, with variable selection methods: sparse partial least squares discriminant analysis, boosting, and frequentist and Bayesian penalised regression approaches.
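The abstract does not give the data-generating details, so the following is only a minimal sketch of how one such data set might be drawn. It assumes standard-normal exposures with a single exchangeable correlation `rho` (with 0 <= rho < 1), a logistic outcome model, and sampling until fixed numbers of cases and controls are reached. The function name, the intercept, and all parameter values are hypothetical and are not taken from the study.

```python
import math
import random


def simulate_case_control(n_cases, n_controls, n_exposures, rho, true_betas,
                          intercept=-2.0, seed=0):
    """Return (X, y) for one simulated case-control data set.

    Assumptions (not from the paper): exposures are standard normal with
    exchangeable correlation rho; true_betas maps an exposure index to its
    log odds ratio, and every other exposure has no effect.
    """
    rng = random.Random(seed)
    cases, controls = [], []
    while len(cases) < n_cases or len(controls) < n_controls:
        # A shared factor induces pairwise correlation rho between exposures.
        shared = rng.gauss(0.0, 1.0)
        x = [math.sqrt(rho) * shared + math.sqrt(1.0 - rho) * rng.gauss(0.0, 1.0)
             for _ in range(n_exposures)]
        # Logistic model for disease status in the source population.
        logit = intercept + sum(beta * x[j] for j, beta in true_betas.items())
        p_disease = 1.0 / (1.0 + math.exp(-logit))
        is_case = rng.random() < p_disease
        # Keep the subject only if the corresponding group is not yet full.
        if is_case and len(cases) < n_cases:
            cases.append(x)
        elif not is_case and len(controls) < n_controls:
            controls.append(x)
    X = cases + controls
    y = [1] * n_cases + [0] * n_controls
    return X, y


# Example: 5 exposures, one of which truly raises the odds of disease (OR = 2).
X, y = simulate_case_control(50, 50, 5, rho=0.5, true_betas={0: math.log(2.0)})
```

Repeating this over many seeds and parameter combinations would give a grid of scenarios in the spirit of the one described, to which each selection method could then be applied.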

RESULTS

The variable selection methods consistently yielded more precise effect estimates and generally improved selection accuracy compared with conventional logistic regression methods, especially for scenarios with higher correlation levels. Penalised lasso and elastic net regression both seemed to perform particularly well, specifically when statistical inference based on a balanced weighting of high sensitivity and a low proportion of false discoveries is sought.
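Sensitivity and the proportion of false discoveries can be computed from the set of exposures a method selects and the set of truly effective exposures. The sketch below uses the usual textbook definitions, which are assumed here rather than quoted from the paper; the function name is hypothetical.

```python
def selection_metrics(selected, truly_effective):
    """Return (sensitivity, false_discovery_proportion) for one selection.

    sensitivity = true positives / number of truly effective exposures
    false discovery proportion = false positives / number selected
    (taken as 0.0 when nothing is selected; an assumed convention).
    """
    selected = set(selected)
    truth = set(truly_effective)
    true_positives = len(selected & truth)
    sensitivity = true_positives / len(truth) if truth else float("nan")
    if selected:
        fdp = (len(selected) - true_positives) / len(selected)
    else:
        fdp = 0.0
    return sensitivity, fdp


# Example: exposures 0, 1 and 2 are truly effective; a method picks 0, 1 and 4.
sens, fdp = selection_metrics(selected=[0, 1, 4], truly_effective=[0, 1, 2])
```

In a simulation, these two quantities would typically be averaged over the data sets generated for each scenario.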

CONCLUSIONS

In this extensive simulation study with multicollinear data, we found that most variable selection methods consistently outperformed conventional approaches, and demonstrated how performance is influenced by the structure of the data and underlying model.

Similar articles

Part 1. Statistical Learning Methods for the Effects of Multiple Air Pollution Constituents.

Res Rep Health Eff Inst. 2015 Jun(183 Pt 1-2):5-50.

Effects of short-term exposure to air pollution on hospital admissions of young children for acute lower respiratory infections in Ho Chi Minh City, Vietnam.

Res Rep Health Eff Inst. 2012 Jun(169):5-72; discussion 73-83.

Part 2. Development of Enhanced Statistical Methods for Assessing Health Effects Associated with an Unknown Number of Major Sources of Multiple Air Pollutants.

Res Rep Health Eff Inst. 2015 Jun(183 Pt 1-2):51-113.

Statistical software for analyzing the health effects of multiple concurrent exposures via Bayesian kernel machine regression.

Environ Health. 2018 Aug 20;17(1):67. doi: 10.1186/s12940-018-0413-y.

Two-step approach for assessing the health effects of environmental chemical mixtures: application to simulated datasets and real data from the Navajo Birth Cohort Study.

Environ Health. 2019 May 9;18(1):46. doi: 10.1186/s12940-019-0482-6.

Performance of variable and function selection methods for estimating the nonlinear health effects of correlated chemical mixtures: A simulation study.

Stat Med. 2020 Nov 30;39(27):3947-3967. doi: 10.1002/sim.8701. Epub 2020 Sep 17.

A systematic comparison of statistical methods to detect interactions in exposome-health associations.

Environ Health. 2017 Jul 14;16(1):74. doi: 10.1186/s12940-017-0277-6.

Statistical strategies for constructing health risk models with multiple pollutants and their interactions: possible choices and comparisons.

Environ Health. 2013 Oct 4;12(1):85. doi: 10.1186/1476-069X-12-85.

Assessing the cumulative health effect following short term exposure to multiple pollutants: An evaluation of methodological approaches using simulations and real data.

Environ Res. 2018 Aug;165:228-234. doi: 10.1016/j.envres.2018.04.021. Epub 2018 May 1.

Cited by

Investigating Exposure and Hazards of Micro- and Nanoplastics During Pregnancy and Early Life (AURORA Project): Protocol for an Interdisciplinary Study.

JMIR Res Protoc. 2024 Oct 8;13:e63176. doi: 10.2196/63176.

Nutritional Modulation of Associations between Prenatal Exposure to Persistent Organic Pollutants and Childhood Obesity: A Prospective Cohort Study.

Environ Health Perspect. 2023 Mar;131(3):37011. doi: 10.1289/EHP11258. Epub 2023 Mar 16.

Socioexposomics of COVID-19 across New Jersey: a comparison of geostatistical and machine learning approaches.

J Expo Sci Environ Epidemiol. 2024 Mar;34(2):197-207. doi: 10.1038/s41370-023-00518-0. Epub 2023 Feb 1.

A Strategy for Field Evaluations of Exposures and Respiratory Health of Workers at Small- to Medium-Sized Coffee Facilities.

Front Public Health. 2021 Nov 11;9:705225. doi: 10.3389/fpubh.2021.705225. eCollection 2021.

Exposure to Perfluoroalkyl Substances During Pregnancy and Fetal BDNF Level: A Prospective Cohort Study.

Front Endocrinol (Lausanne). 2021 Jun 1;12:653095. doi: 10.3389/fendo.2021.653095. eCollection 2021.

Childhood Adversity Trajectories and Violent Behaviors in Adolescence and Early Adulthood.

J Interpers Violence. 2022 Aug;37(15-16):NP13978-NP14007. doi: 10.1177/08862605211006366. Epub 2021 Apr 16.

Identification of high-dimensional omics-derived predictors for tumor growth dynamics using machine learning and pharmacometric modeling.

CPT Pharmacometrics Syst Pharmacol. 2021 Apr;10(4):350-361. doi: 10.1002/psp4.12603. Epub 2021 Apr 8.

The Exposome Approach to Decipher the Role of Multiple Environmental and Lifestyle Determinants in Asthma.

Int J Environ Res Public Health. 2021 Jan 28;18(3):1138. doi: 10.3390/ijerph18031138.

The major effects of health-related quality of life on 5-year survival prediction among lung cancer survivors: applications of machine learning.

Sci Rep. 2020 Jul 1;10(1):10693. doi: 10.1038/s41598-020-67604-3.

Using methylome data to inform exposome-health association studies: An application to the identification of environmental drivers of child body mass index.

Environ Int. 2020 May;138:105622. doi: 10.1016/j.envint.2020.105622. Epub 2020 Mar 14.
