Comparisons of statistical methods for handling attrition in a follow-up visit with complex survey sampling.

Affiliations

Department of Biostatistics, University of North Carolina, Chapel Hill, North Carolina.

Department of Biostatistics and Bioinformatics, The Biostatistics Center, Milken Institute School of Public Health, The George Washington University, Rockville, Maryland.

Publication information

Stat Med. 2023 May 20;42(11):1641-1668. doi: 10.1002/sim.9692. Epub 2023 Mar 7.

DOI:10.1002/sim.9692

PMID:37183765

Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC10957339/

Abstract

Design-based analysis, which accounts for the design features of the study, is commonly used to analyze data from studies with complex survey sampling, such as the Hispanic Community Health Study/Study of Latinos (HCHS/SOL). In this type of longitudinal study, attrition is often a problem. Although various statistical approaches have been proposed to handle attrition, such as inverse probability weighting (IPW), non-response cell weighting (NRCW), multiple imputation (MI), and full information maximum likelihood (FIML), there has not been a systematic assessment comparing their performance in design-based analyses. In this article, we perform extensive simulation studies and compare the performance of different missing data methods in linear and generalized linear population models and under different missing data mechanisms. We find that design-based analysis produces valid estimation and statistical inference when the missing data are handled appropriately using IPW, NRCW, MI, or FIML, provided that the data are missing completely at random or missing at random and the missingness model is correctly specified or over-specified. We also illustrate the use of these methods using data from HCHS/SOL.
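To make the weighting idea concrete, here is a minimal sketch of a non-response cell weighting adjustment. It is not the authors' implementation: the record layout, the cell labels, the function names `nrcw_weights` and `weighted_mean`, and the toy values are all hypothetical, and the choice to use weighted (rather than unweighted) response rates within each cell is an assumption. The sketch covers point estimation only; design-based variance estimation would also need the strata and cluster information.

```python
from collections import defaultdict


def nrcw_weights(records):
    """Return adjusted weights, one per record (0.0 for non-respondents).

    Each record is a dict with hypothetical keys 'cell', 'base_weight',
    and 'responded'. Within each cell, respondents' base weights are
    inflated by (total base weight in cell) / (respondent base weight in cell),
    i.e. the inverse of the weighted response rate.
    """
    total = defaultdict(float)
    responded = defaultdict(float)
    for r in records:
        total[r["cell"]] += r["base_weight"]
        if r["responded"]:
            responded[r["cell"]] += r["base_weight"]

    adjusted = []
    for r in records:
        if not r["responded"]:
            adjusted.append(0.0)
            continue
        # A responding record guarantees its cell has positive respondent weight.
        factor = total[r["cell"]] / responded[r["cell"]]
        adjusted.append(r["base_weight"] * factor)
    return adjusted


def weighted_mean(values, weights):
    """Weighted mean over records with positive weight."""
    pairs = [(v, w) for v, w in zip(values, weights) if w > 0]
    return sum(v * w for v, w in pairs) / sum(w for _, w in pairs)


# Toy data (hypothetical): one of two participants in cell "A" is lost to follow-up.
records = [
    {"cell": "A", "base_weight": 10.0, "responded": True, "y": 1.0},
    {"cell": "A", "base_weight": 10.0, "responded": False, "y": None},
    {"cell": "B", "base_weight": 20.0, "responded": True, "y": 3.0},
    {"cell": "B", "base_weight": 20.0, "responded": True, "y": 5.0},
]
ys = [r["y"] for r in records]
base = [r["base_weight"] if r["responded"] else 0.0 for r in records]
adjusted = nrcw_weights(records)

print(weighted_mean(ys, base))      # complete-case estimate with base weights
print(weighted_mean(ys, adjusted))  # estimate after the cell adjustment
```

In this toy example the adjusted weights sum to the original total base weight (60), so the remaining respondent in cell "A" stands in for the participant who dropped out. The adjustment only removes bias if attrition is unrelated to the outcome within each cell, which corresponds to the missing-at-random condition described above.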

