Bayesian inference under cluster sampling with probability proportional to size.

Author information

Department of Statistics, Columbia University, New York, New York.

Survey Research Center, University of Michigan, Ann Arbor, Michigan.

Publication information

Stat Med. 2018 Nov 20;37(26):3849-3868. doi: 10.1002/sim.7892. Epub 2018 Jul 4.

Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC7993060/

Abstract

Cluster sampling is common in survey practice, and the corresponding inference has been predominantly design based. We develop a Bayesian framework for cluster sampling and account for the design effect in the outcome modeling. We consider a two-stage cluster sampling design in which clusters are first selected with probability proportional to cluster size, and units are then randomly sampled within the selected clusters. Challenges arise when the sizes of the nonsampled clusters are unknown. We propose nonparametric and parametric Bayesian approaches for predicting the unknown cluster sizes. This prediction is performed simultaneously with fitting the model for the survey outcome, and computation is carried out in the open-source Bayesian inference engine Stan. Simulation studies show that the integrated Bayesian approach outperforms classical methods with efficiency gains, especially under an informative cluster sampling design with a small number of selected clusters. We apply the method to the Fragile Families and Child Wellbeing Study as an illustration of inference for complex health surveys.
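
The two-stage design described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' Bayesian method or their Stan code: it only simulates the sampling design and computes a classical design-based estimate of the population mean as a baseline. The function names, the toy population, and the choice of PPS sampling with replacement are assumptions made for illustration.

```python
import random


def pps_two_stage_sample(clusters, n_clusters, n_units, seed=0):
    """Two-stage sample: PPS over clusters (with replacement, an assumption),
    then a simple random sample of units inside each drawn cluster.

    clusters: dict mapping cluster id -> list of unit outcomes (hypothetical layout).
    Returns a list of (cluster_id, sampled_outcomes), one entry per first-stage draw.
    """
    rng = random.Random(seed)
    ids = list(clusters)
    sizes = [len(clusters[j]) for j in ids]
    # Stage 1: cluster j is drawn with probability N_j / sum of all N_j.
    drawn = rng.choices(ids, weights=sizes, k=n_clusters)
    # Stage 2: simple random sample without replacement within each drawn cluster.
    sample = []
    for j in drawn:
        units = clusters[j]
        m = min(n_units, len(units))
        sample.append((j, rng.sample(units, m)))
    return sample


def mean_of_cluster_means(sample):
    """Design-based estimate of the population mean under PPS with replacement:
    the unweighted average of the within-cluster sample means."""
    means = [sum(ys) / len(ys) for _, ys in sample]
    return sum(means) / len(means)


# Toy population (hypothetical): cluster j has 10 * (j + 1) units, each with
# outcome j, so cluster size is informative about the outcome.
population = {j: [float(j)] * (10 * (j + 1)) for j in range(5)}
true_mean = sum(sum(ys) for ys in population.values()) / sum(
    len(ys) for ys in population.values()
)

sample = pps_two_stage_sample(population, n_clusters=3, n_units=4, seed=1)
estimate = mean_of_cluster_means(sample)
print(f"true mean = {true_mean:.3f}, estimate from 3 clusters = {estimate:.3f}")
```

Because larger clusters are both more likely to be selected and, in this toy population, have larger outcomes, the unweighted average of cluster means is the appropriate design-based estimator here. With only a few selected clusters the estimate is noisy, which is the setting where the abstract reports gains from the integrated Bayesian approach.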

Similar articles

1

Bayesian inference under cluster sampling with probability proportional to size.

Stat Med. 2018 Nov 20;37(26):3849-3868. doi: 10.1002/sim.7892. Epub 2018 Jul 4.

2

Relative efficiencies of two-stage sampling schemes for mean estimation in multilevel populations when cluster size is informative.

Stat Med. 2019 May 10;38(10):1817-1834. doi: 10.1002/sim.8070. Epub 2018 Dec 21.

3

Optimal two-stage sampling for mean estimation in multilevel populations when cluster size is informative.

Stat Methods Med Res. 2021 Feb;30(2):357-375. doi: 10.1177/0962280220952833. Epub 2020 Sep 17.

4

Bayesian evaluation of informative hypotheses in cluster-randomized trials.

Behav Res Methods. 2019 Feb;51(1):126-137. doi: 10.3758/s13428-018-1149-x.

5

A Bayesian hierarchical model for mortality data from cluster-sampling household surveys in humanitarian crises.

Int J Epidemiol. 2018 Aug 1;47(4):1255-1263. doi: 10.1093/ije/dyy088.

6

Consensus clustering for Bayesian mixture models.

BMC Bioinformatics. 2022 Jul 21;23(1):290. doi: 10.1186/s12859-022-04830-8.

7

Bayesian predictive inference for units with small sample sizes. The case of binary random variables.

Med Care. 1993 May;31(5 Suppl):YS66-70. doi: 10.1097/00005650-199305001-00010.

8

A Nonparametric Bayesian Model for Nested Clustering.

Methods Mol Biol. 2016;1362:129-41. doi: 10.1007/978-1-4939-3106-4_8.

9

Bayesian Inference of Finite Population Quantiles for Skewed Survey Data Using Skew-Normal Penalized Spline Regression.

J Surv Stat Methodol. 2020 Sep;8(4):792-816. doi: 10.1093/jssam/smz016. Epub 2019 Sep 3.

10

Bayesian inference from count data using discrete uniform priors.

PLoS One. 2013 Oct 7;8(10):e74388. doi: 10.1371/journal.pone.0074388. eCollection 2013.

Cited by

1

On the Use of Auxiliary Variables in Multilevel Regression and Poststratification.

Stat Sci. 2025 May;40(2):272-288. doi: 10.1214/24-sts932. Epub 2025 Jun 2.

2

Inferring Population HIV Viral Load From a Single HIV Clinic's Electronic Health Record: Simulation Study With a Real-World Example.

Online J Public Health Inform. 2024 Jul 3;16:e58058. doi: 10.2196/58058.

3

Bayesian estimation methods for survey data with potential applications to health disparities research.

Wiley Interdiscip Rev Comput Stat. 2024 Jan-Feb;16(1). doi: 10.1002/wics.1633. Epub 2023 Aug 28.

4

Embedded multilevel regression and poststratification: Model-based inference with incomplete auxiliary information.

Stat Med. 2024 Jan 30;43(2):256-278. doi: 10.1002/sim.9956. Epub 2023 Nov 15.

5

New generalized class of estimators for estimation of finite population mean based on probability proportional to size sampling using two auxiliary variables: A simulation study.

Sci Prog. 2023 Oct-Dec;106(4):368504231208537. doi: 10.1177/00368504231208537.

6

Using Small Area Prevalence Survey Methods to Conduct Blood Lead Assessments among Children.

Int J Environ Res Public Health. 2022 May 18;19(10):6151. doi: 10.3390/ijerph19106151.

7

Estimating seroprevalence of SARS-CoV-2 in Ohio: A Bayesian multilevel poststratification approach with multiple diagnostic tests.

Proc Natl Acad Sci U S A. 2021 Jun 29;118(26). doi: 10.1073/pnas.2023947118.

8

Women's Empowerment and HIV Testing Uptake: A Meta-analysis of Demographic and Health Surveys from 33 Sub-Saharan African Countries.

Int J MCH AIDS. 2020;9(3):274-286. doi: 10.21106/ijma.372. Epub 2020 Jul 23.

9

Relative efficiencies of two-stage sampling schemes for mean estimation in multilevel populations when cluster size is informative.

Stat Med. 2019 May 10;38(10):1817-1834. doi: 10.1002/sim.8070. Epub 2018 Dec 21.
