
Semiparametric Estimation with Data Missing Not at Random Using an Instrumental Variable.

Authors

Sun BaoLuo, Liu Lan, Miao Wang, Wirth Kathleen, Robins James, Tchetgen Tchetgen Eric J

Affiliations

Department of Biostatistics, Harvard T.H. Chan School of Public Health.

Beijing International Center for Mathematical Research, Peking University.

Publication information

Stat Sin. 2018 Oct;28(4):1965-1983. doi: 10.5705/ss.202016.0324.

DOI:10.5705/ss.202016.0324
PMID:33335381
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC7743916/
Abstract

Missing data occur frequently in empirical studies in health and social sciences, often compromising our ability to make accurate inferences. An outcome is said to be missing not at random (MNAR) if, conditional on the observed variables, the missing data mechanism still depends on the unobserved outcome. In such settings, identification is generally not possible without imposing additional assumptions. Identification is sometimes possible, however, if an instrumental variable (IV) is observed for all subjects which satisfies the exclusion restriction that the IV affects the missingness process without directly influencing the outcome. In this paper, we provide necessary and sufficient conditions for nonparametric identification of the full data distribution under MNAR with the aid of an IV. In addition, we give sufficient identification conditions that are more straightforward to verify in practice. For inference, we focus on estimation of a population outcome mean, for which we develop a suite of semiparametric estimators that extend methods previously developed for data missing at random. Specifically, we propose inverse probability weighted estimation, outcome regression-based estimation and doubly robust estimation of the mean of an outcome subject to MNAR. For illustration, the methods are used to account for selection bias induced by HIV testing refusal in the evaluation of HIV seroprevalence in Mochudi, Botswana, using interviewer characteristics such as gender, age and years of experience as IVs.
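
As a rough illustration of the problem the abstract describes, the Python sketch below simulates a binary outcome whose chance of being observed depends on the outcome itself (MNAR), then compares the complete-case mean with an inverse probability weighted (IPW) mean. It is not the authors' estimator: in the paper the response probability has to be estimated, with the IV supplying identification, whereas this sketch plugs in the true response probability purely to show what the weighting does. The function names, the logistic response model, and every parameter value are hypothetical choices made for illustration.

```python
import math
import random


def expit(x: float) -> float:
    """Logistic function."""
    return 1.0 / (1.0 + math.exp(-x))


def simulate(n: int, seed: int = 0):
    """Return (complete_case_mean, ipw_mean, full_sample_mean) for one simulated sample."""
    rng = random.Random(seed)
    # Hypothetical parameter values, chosen only for illustration.
    prevalence = 0.30                 # P(Y = 1)
    a0, a1, gamma = 0.5, 1.0, -1.5    # coefficients of the assumed response model

    n_observed = 0
    sum_y_observed = 0.0
    sum_weighted_y = 0.0
    sum_y = 0
    for _ in range(n):
        z = 1 if rng.random() < 0.5 else 0          # IV, drawn independently of Y
        y = 1 if rng.random() < prevalence else 0   # outcome, unaffected by Z
        # P(R = 1 | Z, Y) depends on Y itself, so the outcome is MNAR.
        pi = expit(a0 + a1 * z + gamma * y)
        r = 1 if rng.random() < pi else 0
        sum_y += y
        if r:
            n_observed += 1
            sum_y_observed += y
            # Oracle weight 1 / pi: uses the true response probability (an assumption).
            sum_weighted_y += y / pi
    complete_case_mean = sum_y_observed / n_observed
    ipw_mean = sum_weighted_y / n
    full_sample_mean = sum_y / n
    return complete_case_mean, ipw_mean, full_sample_mean


if __name__ == "__main__":
    cc, ipw, truth = simulate(200_000, seed=1)
    print(f"complete-case mean: {cc:.3f}")
    print(f"oracle IPW mean:    {ipw:.3f}")
    print(f"full-sample mean:   {truth:.3f}")
```

Because subjects with Y = 1 respond less often in this setup, the complete-case mean understates the prevalence, while the weighted mean stays close to the full-sample value. The hard part addressed by the paper, estimating the response probability when Y is unobserved for nonrespondents, is deliberately left out here.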


https://cdn.ncbi.nlm.nih.gov/pmc/blobs/68f4/7743916/ef80e958a4ed/nihms-1623036-f0002.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/68f4/7743916/0a91ada9c550/nihms-1623036-f0001.jpg

Similar articles

1
Semiparametric Estimation with Data Missing Not at Random Using an Instrumental Variable.
Stat Sin. 2018 Oct;28(4):1965-1983. doi: 10.5705/ss.202016.0324.
2
On varieties of doubly robust estimators under missingness not at random with a shadow variable.
Biometrika. 2016 Jun;103(2):475-482. doi: 10.1093/biomet/asw016. Epub 2016 May 10.
3
Discrete Choice Models for Nonmonotone Nonignorable Missing Data: Identification and Inference.
Stat Sin. 2018 Oct;28(4):2069-2088. doi: 10.5705/ss.202016.0325.
4
Semiparametric Inference for Nonmonotone Missing-Not-at-Random Data: The No Self-Censoring Model.
J Am Stat Assoc. 2022;117(539):1415-1423. doi: 10.1080/01621459.2020.1862669. Epub 2021 Feb 3.
5
Handling Missing Data in Instrumental Variable Methods for Causal Inference.
Annu Rev Stat Appl. 2019 Mar;6(1):125-148. doi: 10.1146/annurev-statistics-031017-100353. Epub 2018 Nov 28.
6
Folic acid supplementation and malaria susceptibility and severity among people taking antifolate antimalarial drugs in endemic areas.
Cochrane Database Syst Rev. 2022 Feb 1;2(2022):CD014217. doi: 10.1002/14651858.CD014217.
7
Doubly robust inference for targeted minimum loss-based estimation in randomized trials with missing outcome data.
Stat Med. 2017 Oct 30;36(24):3807-3819. doi: 10.1002/sim.7389. Epub 2017 Jul 25.
8
Identification and inference with nonignorable missing covariate data.
Stat Sin. 2018 Oct;28(4):2049-2067. doi: 10.5705/ss.202016.0322.
9
Instrumental variables and inverse probability weighting for causal inference from longitudinal observational studies.
Stat Methods Med Res. 2004 Feb;13(1):17-48. doi: 10.1191/0962280204sm351ra.
10
A general instrumental variable framework for regression analysis with outcome missing not at random.
Biometrics. 2017 Dec;73(4):1123-1131. doi: 10.1111/biom.12670. Epub 2017 Feb 23.

Cited by

1
Obtaining personalized predictions from a randomized controlled trial on Alzheimer's disease.
Sci Rep. 2025 Jan 11;15(1):1671. doi: 10.1038/s41598-024-84687-4.
2
Double Sampling for Informatively Missing Data in Electronic Health Record-Based Comparative Effectiveness Research.
Stat Med. 2024 Dec 30;43(30):6086-6098. doi: 10.1002/sim.10298. Epub 2024 Dec 5.
3
Testing the missing at random assumption in generalized linear models in the presence of instrumental variables.
Scand Stat Theory Appl. 2024 Mar;51(1):334-354. doi: 10.1111/sjos.12685. Epub 2023 Aug 7.
4
Envelope method with ignorable missing data.
Electron J Stat. 2021;15(2):4420-4461. doi: 10.1214/21-ejs1881. Epub 2021 Sep 14.
5
Non-parametric inference about mean functionals of non-ignorable non-response data without identifying the joint distribution.
J R Stat Soc Series B Stat Methodol. 2023 May 8;85(3):913-935. doi: 10.1093/jrsssb/qkad047. eCollection 2023 Jul.
6
Evaluation of machine learning methods for covariate data imputation in pharmacometrics.
CPT Pharmacometrics Syst Pharmacol. 2022 Dec;11(12):1638-1648. doi: 10.1002/psp4.12874. Epub 2022 Nov 8.
7
Semiparametric Inference for Nonmonotone Missing-Not-at-Random Data: The No Self-Censoring Model.
J Am Stat Assoc. 2022;117(539):1415-1423. doi: 10.1080/01621459.2020.1862669. Epub 2021 Feb 3.
8
A Nuisance-Free Inference Procedure Accounting for the Unknown Missingness with Application to Electronic Health Records.
Entropy (Basel). 2020 Oct 14;22(10):1154. doi: 10.3390/e22101154.
9
Implementation of Instrumental Variable Bounds for Data Missing Not at Random.
Epidemiology. 2018 May;29(3):364-368. doi: 10.1097/EDE.0000000000000811.

References

1
A general instrumental variable framework for regression analysis with outcome missing not at random.
Biometrics. 2017 Dec;73(4):1123-1131. doi: 10.1111/biom.12670. Epub 2017 Feb 23.
2
Estimation of regression models for the mean of repeated outcomes under nonignorable nonmonotone nonresponse.
Biometrika. 2007 Dec;94(4):841-860. doi: 10.1093/biomet/asm070.
3
Sensitivity analysis of incomplete longitudinal data departing from the missing at random assumption: Methodology and application in a clinical trial with drop-outs.
Stat Methods Med Res. 2016 Aug;25(4):1471-89. doi: 10.1177/0962280213490014. Epub 2013 May 22.
4
On doubly robust estimation in a semiparametric odds ratio model.
Biometrika. 2010 Mar;97(1):171-180. doi: 10.1093/biomet/asp062. Epub 2009 Dec 8.
5
On weighting approaches for missing data.
Stat Methods Med Res. 2013 Feb;22(1):14-30. doi: 10.1177/0962280211403597. Epub 2011 Jun 24.
6
A simple implementation of doubly robust estimation in logistic regression with covariates missing at random.
Epidemiology. 2009 May;20(3):391-4. doi: 10.1097/EDE.0b013e3181a0acc7.
7
A semiparametric odds ratio model for measuring association.
Biometrics. 2007 Jun;63(2):413-21. doi: 10.1111/j.1541-0420.2006.00701.x.
8
Sensitivity analysis after multiple imputation under missing at random: a weighting approach.
Stat Methods Med Res. 2007 Jun;16(3):259-75. doi: 10.1177/0962280206075303.
9
Multiple imputation: current perspectives.
Stat Methods Med Res. 2007 Jun;16(3):199-218. doi: 10.1177/0962280206075304.
10
Can one assess whether missing data are missing at random in medical studies?
Stat Methods Med Res. 2006 Jun;15(3):213-34. doi: 10.1191/0962280206sm448oa.