Causal inference with noisy data: Bias analysis and estimation approaches to simultaneously addressing missingness and misclassification in binary outcomes.

Authors

Shu Di, Yi Grace Y

Affiliations

Department of Population Medicine, Harvard Medical School and Harvard Pilgrim Health Care Institute, Boston, Massachusetts.

Department of Statistics and Actuarial Science, University of Waterloo, Waterloo, Ontario, Canada.

Publication information

Stat Med. 2020 Feb 20;39(4):456-468. doi: 10.1002/sim.8419. Epub 2019 Dec 5.

DOI:10.1002/sim.8419

PMID:31802532

Abstract

Causal inference is widely used across many fields, and numerous methods have been proposed for different settings. However, these methods often break down for noisy data that contain both mismeasured and missing observations. In this paper, we consider the problem in which binary outcomes are subject to both missingness and misclassification and interest lies in estimating the average treatment effect (ATE). We examine the asymptotic biases caused by ignoring missingness and/or misclassification and establish the intrinsic connections between the effects of missingness and of misclassification on estimation of the ATE. We develop valid weighted estimation methods that simultaneously correct for both effects. To protect against model misspecification, we further propose a doubly robust correction method that yields consistent estimators when either the treatment model or the outcome model (but not both) is misspecified. Simulation studies assess the performance of the proposed methods, and an application to smoking cessation data illustrates their use.

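The weighted correction described in the abstract can be illustrated with a small simulation. The sketch below is not taken from the paper; it is one plausible form of an inverse-probability-weighted ATE estimator under several stated assumptions: misclassification is nondifferential with known rates `p10` = P(Y*=1 | Y=0) and `p01` = P(Y*=0 | Y=1), outcomes are missing at random given the covariate and treatment, and the true propensity score `e_x` and observation probability `pi_obs` are known (in practice both would be estimated). All function names, variable names, and data-generating parameters are hypothetical.

```python
import numpy as np

def expit(z):
    return 1.0 / (1.0 + np.exp(-z))

def corrected_ipw_ate(t, y_star, r, e_x, pi_obs, p10, p01):
    """IPW estimate of the ATE from a misclassified, partly missing outcome.

    Assumes known misclassification rates (p10, p01), a known propensity
    score e_x = P(T=1 | X), and a known observation probability
    pi_obs = P(R=1 | X, T). These are illustrative assumptions.
    """
    # Missing outcomes get weight zero below; fill them to avoid NaN arithmetic.
    y_filled = np.where(r == 1, y_star, 0.0)
    # E[Y* | Y] = p10 + (1 - p10 - p01) * Y, so this transform is unbiased for Y.
    y_corr = (y_filled - p10) / (1.0 - p10 - p01)
    w1 = r * t / (e_x * pi_obs)              # treated, outcome observed
    w0 = r * (1 - t) / ((1 - e_x) * pi_obs)  # control, outcome observed
    return np.mean(w1 * y_corr) - np.mean(w0 * y_corr)

# Hypothetical data-generating process, for illustration only.
rng = np.random.default_rng(0)
n = 200_000
p10, p01 = 0.1, 0.2

x = rng.normal(size=n)                       # single confounder
e_x = expit(-0.5 * x)                        # propensity score
t = rng.binomial(1, e_x)
mu1 = expit(0.5 + 0.8 * x)                   # P(Y(1)=1 | X)
mu0 = expit(-0.5 + 0.8 * x)                  # P(Y(0)=1 | X)
y = rng.binomial(1, np.where(t == 1, mu1, mu0))

# Misclassify the true outcome, then delete some observations (MAR).
y_star = rng.binomial(1, np.where(y == 1, 1 - p01, p10)).astype(float)
pi_obs = expit(1.0 + 0.5 * x - 0.5 * t)
r = rng.binomial(1, pi_obs)
y_star[r == 0] = np.nan

true_ate = np.mean(mu1 - mu0)
est = corrected_ipw_ate(t, y_star, r, e_x, pi_obs, p10, p01)

# Naive complete-case difference in means, ignoring confounding,
# missingness, and misclassification.
obs = r == 1
naive = y_star[obs & (t == 1)].mean() - y_star[obs & (t == 0)].mean()

print(f"true ATE      : {true_ate:.3f}")
print(f"corrected IPW : {est:.3f}")
print(f"naive         : {naive:.3f}")
```

In this setup the naive complete-case contrast is expected to be biased, while the corrected estimator should fall close to the true ATE. The sketch does not cover the doubly robust method, which additionally requires an outcome model.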