Full Law Identification in Graphical Models of Missing Data: Completeness Results.

Authors

Razieh Nabi, Rohit Bhattacharya, Ilya Shpitser

Affiliation

Department of Computer Science, Johns Hopkins University, Baltimore, MD, USA.

Publication

Proc Mach Learn Res. 2020 Jul;119:7153-7163.

PMID:33283197
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC7716645/

Abstract

Missing data has the potential to affect analyses conducted in all fields of scientific study, including healthcare, economics, and the social sciences. Several approaches to unbiased inference in the presence of non-ignorable missingness rely on the specification of the target distribution and its missingness process as a probability distribution that factorizes with respect to a directed acyclic graph. In this paper, we address the longstanding question of the characterization of models that are identifiable within this class of missing data distributions. We provide the first completeness result in this field of study: necessary and sufficient graphical conditions under which the full data distribution can be recovered from the observed data distribution. We then simultaneously address issues that may arise due to the presence of both missing data and unmeasured confounding, by extending these graphical conditions and proofs of completeness to settings where some variables are not just missing, but completely unobserved.
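
To make "recovering the full data distribution from the observed data distribution" concrete, here is a minimal sketch on one small graph. It is not the paper's identification procedure or its completeness condition. The graph (X1 → X2, X1 → R2, X2 → R1, where R1 and R2 are the observation indicators of X1 and X2) is an illustrative assumption, and every probability value and function name below is hypothetical. In this graph, missingness is non-ignorable (each variable's missingness depends on the other, possibly missing, variable), yet the target law p(X1, X2) can be computed exactly from quantities the analyst can observe.

```python
from itertools import product

# Hypothetical parameters, chosen only for illustration. All variables are binary.
P_X1 = {0: 0.6, 1: 0.4}                                      # p(X1)
P_X2_GIVEN_X1 = {0: {0: 0.7, 1: 0.3}, 1: {0: 0.2, 1: 0.8}}   # p(X2 | X1), indexed [x1][x2]
P_R1_OBS_GIVEN_X2 = {0: 0.9, 1: 0.5}                         # p(R1 = 1 | X2)
P_R2_OBS_GIVEN_X1 = {0: 0.8, 1: 0.4}                         # p(R2 = 1 | X1)


def bernoulli(p_one, value):
    """Probability that a binary variable with p(1) = p_one takes `value`."""
    return p_one if value == 1 else 1.0 - p_one


def full_law():
    """Joint p(X1, X2, R1, R2), factorized according to the assumed DAG."""
    joint = {}
    for x1, x2, r1, r2 in product((0, 1), repeat=4):
        joint[(x1, x2, r1, r2)] = (
            P_X1[x1]
            * P_X2_GIVEN_X1[x1][x2]
            * bernoulli(P_R1_OBS_GIVEN_X2[x2], r1)
            * bernoulli(P_R2_OBS_GIVEN_X1[x1], r2)
        )
    return joint


def observed_law(joint):
    """Margins the analyst can see: a value is visible only if its indicator is 1."""
    both = {(x1, x2): joint[(x1, x2, 1, 1)] for x1, x2 in product((0, 1), repeat=2)}
    # X1 is hidden when R1 = 0, so only the X2 margin of those rows is observed.
    x2_only = {x2: sum(joint[(x1, x2, 0, 1)] for x1 in (0, 1)) for x2 in (0, 1)}
    # X2 is hidden when R2 = 0, so only the X1 margin of those rows is observed.
    x1_only = {x1: sum(joint[(x1, x2, 1, 0)] for x2 in (0, 1)) for x1 in (0, 1)}
    return both, x2_only, x1_only


def recover_target(both, x2_only, x1_only):
    """Recover p(X1, X2) using only observed margins.

    In the assumed graph, R1 is independent of (X1, R2) given X2, so
    p(R1=1 | x2) = p(R1=1 | x2, R2=1), which involves only observed values.
    The symmetric statement holds for R2 given X1.
    """
    recovered = {}
    for x1, x2 in both:
        p_x2_complete = both[(0, x2)] + both[(1, x2)]   # p(x2, R1=1, R2=1)
        p_x1_complete = both[(x1, 0)] + both[(x1, 1)]   # p(x1, R1=1, R2=1)
        r1_propensity = p_x2_complete / (p_x2_complete + x2_only[x2])
        r2_propensity = p_x1_complete / (p_x1_complete + x1_only[x1])
        recovered[(x1, x2)] = both[(x1, x2)] / (r1_propensity * r2_propensity)
    return recovered


if __name__ == "__main__":
    recovered = recover_target(*observed_law(full_law()))
    for (x1, x2), p in sorted(recovered.items()):
        truth = P_X1[x1] * P_X2_GIVEN_X1[x1][x2]
        print(f"p(X1={x1}, X2={x2}): recovered={p:.6f} true={truth:.6f}")
```

The recovered values match the true target law because the two propensities are themselves functions of observed data; once they and p(X1, X2) are known, the full law p(X1, X2, R1, R2) follows from the factorization. Adding an edge such as X1 → R1 (a variable driving its own missingness) would break the independence used in `recover_target`, which is the kind of distinction a graphical identification criterion is meant to capture.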

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/4ae0/7716645/65aa94d619e9/nihms-1649296-f0003.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/4ae0/7716645/5f99e22f218c/nihms-1649296-f0001.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/4ae0/7716645/378734227292/nihms-1649296-f0002.jpg

Similar articles

1. Full Law Identification in Graphical Models of Missing Data: Completeness Results.
   Proc Mach Learn Res. 2020 Jul;119:7153-7163.
2. Identification In Missing Data Models Represented By Directed Acyclic Graphs.
   Uncertain Artif Intell. 2019 Jul;2019.
3. Folic acid supplementation and malaria susceptibility and severity among people taking antifolate antimalarial drugs in endemic areas.
   Cochrane Database Syst Rev. 2022 Feb 1;2(2022):CD014217. doi: 10.1002/14651858.CD014217.
4. Multiple imputation with missing indicators as proxies for unmeasured variables: simulation study.
   BMC Med Res Methodol. 2020 Jul 8;20(1):185. doi: 10.1186/s12874-020-01068-x.
5. Semiparametric Estimation with Data Missing Not at Random Using an Instrumental Variable.
   Stat Sin. 2018 Oct;28(4):1965-1983. doi: 10.5705/ss.202016.0324.
6. Ignoring Non-ignorable Missingness.
   Psychometrika. 2023 Mar;88(1):31-50. doi: 10.1007/s11336-022-09895-1. Epub 2022 Dec 20.
7. Statistical Modeling of Longitudinal Data with Non-ignorable Non-monotone Missingness with Semiparametric Bayesian and Machine Learning Components.
   Sankhya B (2008). 2021 May;83(1):152-169. doi: 10.1007/s13571-019-00222-w. Epub 2020 Mar 9.
8. Analysis of Missingness Scenarios for Observational Health Data.
   J Pers Med. 2024 May 11;14(5):514. doi: 10.3390/jpm14050514.
9. Discrete Choice Models for Nonmonotone Nonignorable Missing Data: Identification and Inference.
   Stat Sin. 2018 Oct;28(4):2069-2088. doi: 10.5705/ss.202016.0325.
10. Recoverability and estimation of causal effects under typical multivariable missingness mechanisms.
    Biom J. 2024 Apr;66(3):e2200326. doi: 10.1002/bimj.202200326.

Cited by

1. Pitfalls of imputing using incomplete auxiliary variables.
   Am J Epidemiol. 2025 Jun 3;194(6):1801-1802. doi: 10.1093/aje/kwaf043.
2. Computational tools and data integration to accelerate vaccine development: challenges, opportunities, and future directions.
   Front Immunol. 2025 Mar 7;16:1502484. doi: 10.3389/fimmu.2025.1502484. eCollection 2025.
3. Causal Inference With Outcome-Dependent Missingness And Self-Censoring.
   Proc Mach Learn Res. 2023 Aug;216:358-368.
4. Mathur and Shpitser respond to "The evolution of selection bias in the recent epidemiologic literature-a selective overview".
   Am J Epidemiol. 2025 Mar 4;194(3):585-586. doi: 10.1093/aje/kwae287.
5. Analysis of Missingness Scenarios for Observational Health Data.
   J Pers Med. 2024 May 11;14(5):514. doi: 10.3390/jpm14050514.
6. Leveraging Structured Biological Knowledge for Counterfactual Inference: A Case Study of Viral Pathogenesis.
   IEEE Trans Big Data. 2021 Jan 18;7(1):25-37. doi: 10.1109/TBDATA.2021.3050680. eCollection 2021 Mar 1.
7. Conditional generation of medical time series for extrapolation to underrepresented populations.
   PLOS Digit Health. 2022 Jul 19;1(7):e0000074. doi: 10.1371/journal.pdig.0000074. eCollection 2022 Jul.
8. A Robust Functional EM Algorithm for Incomplete Panel Count Data.
   Adv Neural Inf Process Syst. 2020 Dec;33:19828-19838.
