Carone Marco, Luedtke Alexander R, van der Laan Mark J
Department of Biostatistics, University of Washington.
Vaccine and Infectious Disease Division, Fred Hutchinson Cancer Research Center.
J Am Stat Assoc. 2019;114(527):1174-1190. doi: 10.1080/01621459.2018.1482752. Epub 2018 Sep 13.
Despite the risk of misspecification they are tied to, parametric models continue to be used in statistical practice because they are simple and convenient to use. In particular, efficient estimation procedures in parametric models are easy to describe and implement. Unfortunately, the same cannot be said of semiparametric and nonparametric models. While the latter often reflect the level of available scientific knowledge more appropriately, performing efficient inference in these models is generally challenging. The efficient influence function is a key analytic object from which the construction of asymptotically efficient estimators can potentially be streamlined. However, the theoretical derivation of the efficient influence function requires specialized knowledge and is often a difficult task, even for experts. In this paper, we present a novel representation of the efficient influence function and describe a numerical procedure for approximating its evaluation. The approach generalizes the nonparametric procedures of Frangakis et al. (2015) and Luedtke et al. (2015) to arbitrary models. We present theoretical results to support our proposal, and illustrate the method in the context of several semiparametric problems. The proposed approach is an important step toward automating efficient estimation in general statistical models, thereby rendering more accessible the use of realistic models in statistical analyses.
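To make the idea of numerically approximating an influence function concrete, the following minimal sketch illustrates only the nonparametric special case, in the spirit of the procedures the abstract attributes to Frangakis et al. (2015) and Luedtke et al. (2015); it is not the representation proposed in the paper. It assumes a discrete distribution on a few support points, uses the variance as the target parameter, and approximates the influence function at each support point by a finite-difference derivative of the parameter along a mixture of the distribution with a point mass. The names `variance_functional` and `numerical_influence`, the step size `eps`, and the toy data are hypothetical choices made for illustration.

```python
import numpy as np

def variance_functional(values, weights):
    """Variance of a discrete distribution placing `weights` on `values`."""
    mean = np.sum(weights * values)
    return np.sum(weights * (values - mean) ** 2)

def numerical_influence(functional, values, weights, index, eps=1e-6):
    """Finite-difference derivative of `functional` along the path
    (1 - eps) * P + eps * (point mass at values[index])."""
    point_mass = np.zeros_like(weights)
    point_mass[index] = 1.0
    perturbed = (1.0 - eps) * weights + eps * point_mass
    return (functional(values, perturbed) - functional(values, weights)) / eps

# Toy distribution (illustrative values only).
values = np.array([0.0, 1.0, 2.0, 5.0])
weights = np.full(4, 0.25)

# Numerical approximation at every support point.
approx = np.array([
    numerical_influence(variance_functional, values, weights, i)
    for i in range(len(values))
])

# Known closed form for the variance in a nonparametric model:
# (x - mean)^2 - variance.
mean = np.sum(weights * values)
exact = (values - mean) ** 2 - variance_functional(values, weights)

print(np.max(np.abs(approx - exact)))
```

In this toy case the finite-difference values can be compared directly against the closed-form influence function of the variance, which is why that parameter was chosen. In models with restrictions (semiparametric models), a point-mass perturbation generally leaves the model, which is the difficulty the paper's generalization to arbitrary models is meant to address.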