Carroll Raymond J, Chen Xiaohong, Hu Yingyao
Department of Statistics, Texas A&M University,
J Nonparametr Stat. 2010 May 1;22(4):379-399. doi: 10.1080/10485250902874688.
This paper considers identification and estimation of a general nonlinear Errors-in-Variables (EIV) model using two samples. Both samples consist of a dependent variable, some error-free covariates, and an error-prone covariate, for which the measurement error has unknown distribution and could be arbitrarily correlated with the latent true values; and neither sample contains an accurate measurement of the corresponding true variable. We assume that the regression model of interest - the conditional distribution of the dependent variable given the latent true covariate and the error-free covariates - is the same in both samples, but the distributions of the latent true covariates vary with observed error-free discrete covariates. We first show that the general latent nonlinear model is nonparametrically identified using the two samples when both could have nonclassical errors, without either instrumental variables or independence between the two samples. When the two samples are independent and the nonlinear regression model is parameterized, we propose sieve Quasi Maximum Likelihood Estimation (Q-MLE) for the parameter of interest, and establish its root-n consistency and asymptotic normality under possible misspecification, and its semiparametric efficiency under correct specification, with easily estimated standard errors. A Monte Carlo simulation and a data application are presented to show the power of the approach.
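The two-sample structure described above can be written as a mixture over the latent covariate. The display below is only a sketch: the notation ($Y$ for the dependent variable, $X^*$ for the latent true covariate, $X$ for its error-prone measurement, $W$ for the error-free covariates, and $s \in \{1,2\}$ for the sample index) is illustrative and not taken from the abstract. The factorization also assumes that the measurement is conditionally independent of $Y$ and $W$ given $X^*$, an assumption the abstract does not state.

```latex
% Illustrative notation; the conditional-independence step is an assumption of this sketch.
\begin{equation*}
  f^{(s)}_{Y,X \mid W}(y, x \mid w)
  = \int
      \underbrace{f_{Y \mid X^*, W}(y \mid x^*, w)}_{\text{common to both samples}}
      \;
      \underbrace{f^{(s)}_{X \mid X^*}(x \mid x^*)}_{\text{error law, unrestricted}}
      \;
      \underbrace{f^{(s)}_{X^* \mid W}(x^* \mid w)}_{\text{varies with discrete } w}
    \, dx^* ,
  \qquad s = 1, 2 .
\end{equation*}
```

Only the left-hand side is observable in either sample. The regression density carries no sample index because it is assumed to be the same in both samples, while the error law $f^{(s)}_{X \mid X^*}$ is left unrestricted, so the error may depend on $X^*$ (the nonclassical case).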