Approaches to addressing missing values, measurement error, and confounding in epidemiologic studies.

Author information

Department of Clinical Epidemiology, Leiden University Medical Center, Leiden, The Netherlands.

Publication information

J Clin Epidemiol. 2021 Mar;131:89-100. doi: 10.1016/j.jclinepi.2020.11.006. Epub 2020 Nov 8.

Abstract

OBJECTIVES

Epidemiologic studies often suffer from incomplete data, measurement error (or misclassification), and confounding. Each of these can cause bias and imprecision in estimates of exposure-outcome relations. We describe and compare statistical approaches that aim to control all three sources of bias simultaneously.

STUDY DESIGN AND SETTING

We illustrate four statistical approaches that address all three sources of bias, namely, multiple imputation for missing data and measurement error, multiple imputation combined with regression calibration, full information maximum likelihood within a structural equation modeling framework, and a Bayesian model. In a simulation study, we assess the performance of the four approaches compared with more commonly used approaches that do not account for measurement error, missing values, or confounding.

RESULTS

The results demonstrate that the four approaches consistently outperform the alternative approaches on all performance metrics (bias, mean squared error, and confidence interval coverage). Even in simulated data of 100 subjects, these approaches perform well.

CONCLUSION

Addressing measurement error, missing values, and confounding can substantially improve the estimation of exposure-outcome relations, even when the available sample size is relatively small.
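To make the bias mechanisms concrete, the following is a minimal sketch of one ingredient named above, regression calibration, applied to simulated data with a confounder and an error-prone exposure. It is not the authors' simulation design: the data-generating model, the parameter values, the sample size, the 30% replicate substudy, and all variable names are assumptions chosen purely for illustration. Missing outcome or covariate values and the multiple-imputation step are not handled here.

```python
import numpy as np


def ols(y, *columns):
    """Least-squares coefficients of y on an intercept plus the given columns."""
    design = np.column_stack([np.ones(len(y)), *columns])
    coef, *_ = np.linalg.lstsq(design, y, rcond=None)
    return coef


# All values below are assumed for illustration only.
rng = np.random.default_rng(2021)
n = 20_000
true_beta = 1.0  # assumed true exposure-outcome coefficient

c = rng.normal(size=n)                            # confounder
x = 0.5 * c + rng.normal(size=n)                  # true exposure (unobserved in practice)
y = true_beta * x + 0.5 * c + rng.normal(size=n)  # outcome
w1 = x + rng.normal(size=n)                       # error-prone exposure measurement
w2 = x + rng.normal(size=n)                       # replicate measurement
has_rep = rng.random(n) < 0.3                     # replicate observed for ~30% of subjects

# Crude analysis: ignores both confounding and measurement error.
crude = ols(y, w1)[1]

# Confounder-adjusted analysis that still ignores measurement error.
naive = ols(y, w1, c)[1]

# Regression calibration: because the replicate's error is independent of
# (w1, c), regressing w2 on (w1, c) in the replicate substudy estimates
# E[x | w1, c]. The fitted values then replace w1 in the outcome model.
a = ols(w2[has_rep], w1[has_rep], c[has_rep])
x_hat = a[0] + a[1] * w1 + a[2] * c
calibrated = ols(y, x_hat, c)[1]

print(f"crude      : {crude:.3f}")
print(f"naive      : {naive:.3f}")
print(f"calibrated : {calibrated:.3f}")
```

Under these assumed parameters, the large-sample value of the crude coefficient is 2/3 and that of the confounder-adjusted but uncorrected coefficient is 0.5, whereas the calibrated coefficient should land close to the assumed true value of 1.0. A full analysis would also need to propagate the uncertainty of the calibration step into the standard errors, for example by bootstrapping.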
