Westling Ted, Luedtke Alex, Gilbert Peter B, Carone Marco
Department of Mathematics and Statistics, University of Massachusetts Amherst.
Department of Statistics, University of Washington.
J Am Stat Assoc. 2024;119(546):1541-1553. doi: 10.1080/01621459.2023.2205060. Epub 2023 Jun 5.
In the absence of data from a randomized trial, researchers may aim to use observational data to draw causal inferences about the effect of a treatment on a time-to-event outcome. In this context, interest often focuses on the treatment-specific survival curves, that is, the survival curves that would be observed if the entire population under study were assigned to receive the treatment or not. Under certain conditions, including that all confounders of the treatment-outcome relationship are observed, the treatment-specific survival curve can be identified with a covariate-adjusted survival curve. In this article, we propose a novel cross-fitted doubly-robust estimator that incorporates data-adaptive (e.g., machine learning) estimators of the conditional survival functions. We establish conditions on the nuisance estimators under which our estimator is consistent and asymptotically linear, both pointwise and uniformly in time. We also propose a novel ensemble learner for combining multiple candidate estimators of the conditional survival functions. Notably, our methods and results accommodate events occurring in discrete or continuous time, or an arbitrary mix of the two. We investigate the practical performance of our methods using numerical studies and an application estimating the effect on mortality of a surgical treatment to prevent metastases of parotid carcinoma.
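The identification step mentioned in the abstract can be written out as a short formula. The notation here is an assumption for illustration and is not taken from the abstract: T(a) denotes the event time that would occur under treatment level a, A the observed treatment, W the observed covariates, and S_0 the conditional survival function of the event time given treatment and covariates. Under the usual conditions (consistency, no unmeasured confounding, positivity, and suitably independent censoring), a sketch of the covariate-adjusted form is:

```latex
% Assumed notation: T(a) = potential event time under treatment a,
% A = observed treatment, W = observed covariates.
\[
  S_0(t \mid a, w) = P\left(T > t \mid A = a,\, W = w\right),
  \qquad
  P\left(T(a) > t\right) = E\left[\, S_0(t \mid a, W) \,\right],
\]
% The expectation averages the conditional survival function over the
% marginal distribution of W in the population under study.
```

The right-hand side depends only on the observed-data distribution, which is why data-adaptive estimators of the conditional survival functions can serve as nuisance estimators.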