Ma Yanyuan, Li Runze
Department of Statistics, Texas A&M University, College Station, TX 77843.
Bernoulli (Andover). 2010;16(1):274-300. doi: 10.3150/09-bej205.
Measurement error data, also called errors-in-variables data, are collected in many studies. Natural criterion functions are often unavailable for general functional measurement error models because the distribution of the unobservable covariates is unknown. Parameters are therefore typically estimated by solving estimating equations. Constructing such estimating equations routinely requires solving integral equations, so the computation is often far more intensive than for ordinary regression models. Because of these difficulties, traditional best subset variable selection procedures are not applicable, and variable selection in measurement error models remains an unsolved issue. In this paper, we develop a framework for variable selection in measurement error models via penalized estimating equations. We first propose a class of selection procedures for general parametric measurement error models and for general semiparametric measurement error models, and study the asymptotic properties of the proposed procedures. Then, under certain regularity conditions and with a properly chosen regularization parameter, we demonstrate that the proposed procedure performs as well as an oracle procedure. We assess the finite sample performance via Monte Carlo simulation studies and illustrate the proposed methodology through the empirical analysis of a familiar data set.