Improving Survey Inference Using Administrative Records Without Releasing Individual-Level Continuous Data.

Authors

Williams Sharifa Z, Zou Jungang, Liu Yutao, Si Yajuan, Galea Sandro, Chen Qixuan

Affiliations

Department of Biostatistics, Mailman School of Public Health, Columbia University, New York, New York, USA.

Edward J. Bloustein School of Planning and Public Policy, Rutgers University, New Brunswick, New Jersey, USA.

Publication information

Stat Med. 2024 Dec 30;43(30):5803-5813. doi: 10.1002/sim.10270. Epub 2024 Nov 18.

DOI:10.1002/sim.10270
PMID:39557420
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC11639655/
Abstract

Probability surveys are challenged by increasing nonresponse rates, resulting in biased statistical inference. Auxiliary information about populations can be used to reduce bias in estimation. Continuous auxiliary variables in administrative records are often discretized before being released to the public to avoid confidentiality breaches. This may weaken the utility of the administrative records in improving survey estimates, particularly when there is a strong relationship between continuous auxiliary information and the survey outcome. In this paper, we propose a two-step strategy. In the first step, statistical agencies use the confidential continuous auxiliary data in the population to estimate the response propensity score of the survey sample, which is then included in a modified population dataset for data users. In the second step, data users who do not have access to the confidential continuous auxiliary data conduct predictive survey inference by including the discretized continuous variables and the propensity score as predictors, using splines, in a Bayesian model. We show by simulation that the proposed method performs well, yielding more efficient estimates of population means with 95% credible intervals providing better coverage than alternative approaches. We illustrate the proposed method using the Ohio Army National Guard Mental Health Initiative (OHARNG-MHI). The methods developed in this work are readily available in the R package AuxSurvey.
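
The two-step strategy described in the abstract can be illustrated with a small simulation. The sketch below is not the authors' implementation and does not use the AuxSurvey package: the population, the coefficients, the quartile cut points, and the knot locations are all assumptions made for illustration, and ordinary least squares with a truncated linear spline basis stands in for the Bayesian spline model. It only shows the flow of information: the agency fits a response propensity model on the confidential continuous variable, releases a discretized version plus the propensity score, and the data user predicts the outcome for nonrespondents from those two released quantities.

```python
import numpy as np

rng = np.random.default_rng(0)

# --- Simulated population (all values are illustrative assumptions) ---
N = 5000
x = rng.normal(size=N)                    # confidential continuous auxiliary variable
y = 2.0 + 1.5 * x + rng.normal(size=N)    # survey outcome, strongly related to x
p_true = 1.0 / (1.0 + np.exp(-(-0.5 + 1.0 * x)))
responded = rng.random(N) < p_true        # response depends on x, so respondents are not representative


def fit_logistic(X, r, n_iter=25):
    """Fit a logistic regression by Newton-Raphson and return the coefficients."""
    beta = np.zeros(X.shape[1])
    for _ in range(n_iter):
        mu = 1.0 / (1.0 + np.exp(-X @ beta))
        grad = X.T @ (r - mu)
        hess = X.T @ (X * (mu * (1.0 - mu))[:, None])
        beta += np.linalg.solve(hess, grad)
    return beta


# --- Step 1 (agency): estimate the response propensity from continuous x ---
X_agency = np.column_stack([np.ones(N), x])
beta = fit_logistic(X_agency, responded.astype(float))
pscore = 1.0 / (1.0 + np.exp(-X_agency @ beta))

# The agency releases only a discretized x (quartile groups) and the propensity score.
cuts = np.quantile(x, [0.25, 0.5, 0.75])
x_cat = np.digitize(x, cuts)              # categories 0..3


# --- Step 2 (data user): outcome model on x_cat and a spline in pscore ---
def design(x_cat, ps, knots):
    """Design matrix: intercept, category dummies, and a truncated linear spline in ps."""
    dummies = (x_cat[:, None] == np.arange(1, 4)).astype(float)  # category 0 is the reference
    spline = np.maximum(ps[:, None] - knots, 0.0)
    return np.column_stack([np.ones(len(ps)), dummies, ps, spline])


knots = np.quantile(pscore, [0.2, 0.4, 0.6, 0.8])
D = design(x_cat, pscore, knots)
# Least squares on respondents only (a stand-in for the Bayesian model in the paper).
coef, *_ = np.linalg.lstsq(D[responded], y[responded], rcond=None)

# Predict the outcome for nonrespondents and combine with observed values.
y_filled = np.where(responded, y, D @ coef)
est_two_step = y_filled.mean()
est_naive = y[responded].mean()           # respondent mean, ignoring nonresponse
true_mean = y.mean()

print(f"true mean      : {true_mean:.3f}")
print(f"respondent mean: {est_naive:.3f}")
print(f"two-step       : {est_two_step:.3f}")
```

In this setup the released propensity score is a monotone function of the confidential variable, so a flexible function of the score can recover much of the information that discretization alone discards. A point estimate like this carries no uncertainty statement; the credible intervals discussed in the abstract come from the Bayesian model, which this sketch does not attempt.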


https://cdn.ncbi.nlm.nih.gov/pmc/blobs/2b7a/11639655/e78bcfaae0a8/SIM-43-5803-g001.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/2b7a/11639655/1d3058a20162/SIM-43-5803-g002.jpg
