Measures of the Degree of Departure from Ignorable Sample Selection

Authors

Little Roderick J A, West Brady T, Boonstra Philip S, Hu Jingwei

Affiliations

Professor of Biostatistics at the School of Public Health and Research Professor in the Survey Methodology Program (SMP), Survey Research Center (SRC), Institute for Social Research (ISR), University of Michigan, 1420 Washington Heights, Ann Arbor, MI 48109-2029, USA.

Research Associate Professor in the Survey Methodology Program (SMP), Survey Research Center (SRC), Institute for Social Research (ISR), University of Michigan, 426 Thompson Street, Ann Arbor, MI 48106-1248, USA.

Publication information

J Surv Stat Methodol. 2020 Nov;8(5):932-964. doi: 10.1093/jssam/smz023. Epub 2019 Aug 29.

DOI: 10.1093/jssam/smz023

PMID: 33381610

Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC7750890/

Abstract

With the current focus of survey researchers on "big data" that are not selected by probability sampling, measures of the degree of potential sampling bias arising from this nonrandom selection are sorely needed. Existing indices of this degree of departure from probability sampling, like the R-indicator, are based on functions of the propensity of inclusion in the sample, estimated by modeling the inclusion probability as a function of auxiliary variables. These methods are agnostic about the relationship between the inclusion probability and survey outcomes, which is a crucial feature of the problem. We propose a simple index of degree of departure from ignorable sample selection that corrects this deficiency, which we call the standardized measure of unadjusted bias (SMUB). The index is based on normal pattern-mixture models for nonresponse applied to this sample selection problem and is grounded in the model-based framework of nonignorable selection first proposed in the context of nonresponse by Don Rubin in 1976. The index depends on an inestimable parameter that measures the deviation from selection at random, which ranges between the values zero and one. We propose the use of a central value of this parameter, 0.5, for computing a point index, and computing the values of SMUB at zero and one to provide a range of the index in a sensitivity analysis. We also provide a fully Bayesian approach for computing credible intervals for the SMUB, reflecting uncertainty in the values of all of the input parameters. The proposed methods have been implemented in R and are illustrated using real data from the National Survey of Family Growth.
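The sensitivity analysis described in the abstract can be sketched in a few lines. The abstract does not state the formula, so the expression below is an assumption: it is the proxy pattern-mixture form, in which the standardized difference between the selected-sample mean and the population mean of a proxy variable X is scaled by a factor depending on the inestimable parameter phi and on the sample correlation between X and the outcome Y. The function name `smub` and all argument names are hypothetical and are not taken from the authors' R implementation.

```python
import math


def smub(phi, rho_xy, xbar_selected, xbar_population, var_x_selected):
    """Illustrative SMUB(phi) under an ASSUMED proxy pattern-mixture formula.

    phi             -- sensitivity parameter in [0, 1]; 0 means selection
                       depends only on the proxy X, 1 means only on Y
    rho_xy          -- correlation of proxy X and outcome Y in the selected
                       sample, assumed to lie in (0, 1]
    xbar_selected   -- mean of X among selected units
    xbar_population -- known population mean of X
    var_x_selected  -- variance of X among selected units
    """
    if not 0.0 <= phi <= 1.0:
        raise ValueError("phi must lie in [0, 1]")
    if not 0.0 < rho_xy <= 1.0:
        raise ValueError("rho_xy must lie in (0, 1]")
    if var_x_selected <= 0.0:
        raise ValueError("var_x_selected must be positive")

    # Standardized difference between selected and population proxy means.
    std_diff = (xbar_selected - xbar_population) / math.sqrt(var_x_selected)

    # Scaling factor: equals rho_xy at phi = 0, 1 at phi = 0.5,
    # and 1 / rho_xy at phi = 1.
    scale = (phi + (1.0 - phi) * rho_xy) / (phi * rho_xy + (1.0 - phi))
    return scale * std_diff


# Point index at phi = 0.5 with the sensitivity range at phi = 0 and phi = 1,
# using made-up inputs purely for illustration.
inputs = dict(rho_xy=0.5, xbar_selected=1.2, xbar_population=1.0,
              var_x_selected=0.16)
point_index = smub(0.5, **inputs)
sensitivity_range = (smub(0.0, **inputs), smub(1.0, **inputs))
print(point_index, sensitivity_range)
```

Under this assumed form, a weak correlation between the proxy and the outcome widens the interval between the phi = 0 and phi = 1 values, which is consistent with the abstract's point that the relationship between inclusion and the survey outcome matters. The fully Bayesian credible intervals mentioned in the abstract are not covered by this sketch.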
