A doubly robust method to handle missing multilevel outcome data with application to the China Health and Nutrition Survey.

Affiliations

The Biostatistics Center, Department of Biostatistics and Bioinformatics, Milken Institute School of Public Health, The George Washington University, Rockville, Maryland, USA.

Department of Biostatistics, Gillings School of Global Public Health, The University of North Carolina at Chapel Hill, Chapel Hill, North Carolina, USA.

Publication information

Stat Med. 2022 Feb 20;41(4):769-785. doi: 10.1002/sim.9260. Epub 2021 Nov 16.

DOI:10.1002/sim.9260

PMID:34786739

Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC8795489/

Abstract

Missing data are common in longitudinal cohort studies and can lead to bias, particularly in studies with informative missingness. Many common methods for handling informatively missing data in survey samples require correctly specifying a model for missingness. Although doubly robust methods exist to provide unbiased regression coefficients in the presence of missing outcome data, these methods do not account for correlation due to clustering inherent in longitudinal or cluster-sampled studies. In this work, we developed a doubly robust method to estimate the regression of an outcome on a predictor in the presence of missing multilevel data on the outcome, which results in consistent estimation of regression coefficients assuming correct specification of either (1) the probability of missingness or (2) the outcome model. This method involves specification of separate hierarchical models for missingness and for the outcome, conditional on observed auxiliary variables and cluster-specific random effects, to account for correlation among observations. We showed this proposed estimator is doubly robust and derived its asymptotic distribution, conducted simulation studies to compare the method to an existing doubly robust method developed for independent data, and applied the method to data from the China Health and Nutrition Survey, an ongoing multilevel longitudinal cohort study.
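
The double-robustness property described in the abstract can be illustrated with a much simpler setting than the one the paper treats. The following is a minimal sketch, not the authors' multilevel estimator: it uses the generic augmented inverse-probability-weighted (AIPW) estimator of an outcome mean for independent data, with no clustering or random effects. The simulated data, the single binary auxiliary variable `x`, and all function names are assumptions made for illustration. The missingness and outcome models are simple stratified proportions and means; a "misspecified" model here is one that ignores `x`.

```python
import random
from statistics import mean


def simulate(n=20000, seed=0):
    """Hypothetical data: binary auxiliary x, outcome y, observed flag r."""
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        x = 1 if rng.random() < 0.5 else 0
        y = 1.0 + 2.0 * x + rng.gauss(0.0, 1.0)  # true E[Y] = 2
        p_obs = 0.3 if x == 1 else 0.9           # missingness depends on x
        r = 1 if rng.random() < p_obs else 0
        rows.append((x, y, r))
    return rows


def fit_models(rows, pi_uses_x, m_uses_x):
    """Fit stratified models for P(observed | x) and E[Y | x, observed].

    Setting a flag to False pools both strata, which misspecifies
    that model for the simulated data above.
    """
    pi_hat, m_hat = {}, {}
    for x in (0, 1):
        pi_rows = [row for row in rows if row[0] == x or not pi_uses_x]
        m_rows = [row for row in rows
                  if (row[0] == x or not m_uses_x) and row[2] == 1]
        pi_hat[x] = mean(r for _, _, r in pi_rows)
        m_hat[x] = mean(y for _, y, _ in m_rows)
    return pi_hat, m_hat


def aipw_mean(rows, pi_hat, m_hat):
    """AIPW estimate of E[Y]: outcome-model prediction plus weighted residual."""
    total = 0.0
    for x, y, r in rows:
        # y is treated as unknown (never used) when r == 0
        residual = (y - m_hat[x]) if r == 1 else 0.0
        total += m_hat[x] + residual / pi_hat[x]
    return total / len(rows)


rows = simulate()
complete_case = mean(y for _, y, r in rows if r == 1)
print("complete-case mean:", round(complete_case, 3))
for pi_ok in (True, False):
    for m_ok in (True, False):
        est = aipw_mean(rows, *fit_models(rows, pi_ok, m_ok))
        print(f"missingness model ok={pi_ok}, outcome model ok={m_ok}:",
              round(est, 3))
```

In this toy setup the estimate should land near the true mean of 2 whenever at least one of the two models uses `x`, and drift toward the biased complete-case mean when both ignore it. The paper's contribution, per the abstract, is to extend this kind of guarantee to regression coefficients under hierarchical models with cluster-specific random effects, which this sketch does not attempt.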

Similar articles

Doubly robust estimation of generalized partial linear models for longitudinal data with dropouts.

Biometrics. 2017 Dec;73(4):1132-1139. doi: 10.1111/biom.12703. Epub 2017 Apr 3.

Doubly robust inference for targeted minimum loss-based estimation in randomized trials with missing outcome data.

Stat Med. 2017 Oct 30;36(24):3807-3819. doi: 10.1002/sim.7389. Epub 2017 Jul 25.

A Two-Step Approach for Analysis of Nonignorable Missing Outcomes in Longitudinal Regression: an Application to Upstate KIDS Study.

Paediatr Perinat Epidemiol. 2017 Sep;31(5):468-478. doi: 10.1111/ppe.12382. Epub 2017 Aug 2.

Data-Adaptive Bias-Reduced Doubly Robust Estimation.

Int J Biostat. 2016 May 1;12(1):253-82. doi: 10.1515/ijb-2015-0029.

Explicating the Conditions Under Which Multilevel Multiple Imputation Mitigates Bias Resulting from Random Coefficient-Dependent Missing Longitudinal Data.

Prev Sci. 2017 Jan;18(1):12-19. doi: 10.1007/s11121-016-0735-3.

Causal inference with noisy data: Bias analysis and estimation approaches to simultaneously addressing missingness and misclassification in binary outcomes.

Stat Med. 2020 Feb 20;39(4):456-468. doi: 10.1002/sim.8419. Epub 2019 Dec 5.

Folic acid supplementation and malaria susceptibility and severity among people taking antifolate antimalarial drugs in endemic areas.

Cochrane Database Syst Rev. 2022 Feb 1;2(2022):CD014217. doi: 10.1002/14651858.CD014217.

Handling missing data when estimating causal effects with targeted maximum likelihood estimation.

Am J Epidemiol. 2024 Jul 8;193(7):1019-1030. doi: 10.1093/aje/kwae012.

Leveraging auxiliary data to improve precision in inverse probability-weighted analyses.

Ann Epidemiol. 2022 Oct;74:75-83. doi: 10.1016/j.annepidem.2022.07.011. Epub 2022 Aug 5.

Cited by

A commentary on 'The obesity challenge in joint replacement: a multifaceted analysis of self-reported health status and exercise capacity using NHANES data - a population-based study'.

Int J Surg. 2024 Aug 1;110(8):5244-5245. doi: 10.1097/JS9.0000000000001523.

Leveraging informative missing data to learn about acute respiratory distress syndrome and mortality in long-term hospitalized COVID-19 patients throughout the years of the pandemic.

AMIA Annu Symp Proc. 2024 Jan 11;2023:942-950. eCollection 2023.

Leveraging informative missing data to learn about acute respiratory distress syndrome and mortality in long-term hospitalized COVID-19 patients throughout the years of the pandemic.

medRxiv. 2023 Dec 19:2023.12.18.23300181. doi: 10.1101/2023.12.18.23300181.

Informative missingness: What can we learn from patterns in missing laboratory data in the electronic health record?

J Biomed Inform. 2023 Mar;139:104306. doi: 10.1016/j.jbi.2023.104306. Epub 2023 Feb 3.
