Dai Xiaowu
Department of Statistics and Data Science, and Biostatistics, University of California, Los Angeles, CA 90095, USA.
J R Stat Soc Series B Stat Methodol. 2024 Sep 11;87(2):319-336. doi: 10.1093/jrsssb/qkae093. eCollection 2025 Apr.
Traditional nonparametric estimation methods often lead to a slow convergence rate in large dimensions and require unrealistically large dataset sizes for reliable conclusions. We develop an approach based on partial derivatives, either observed or estimated, to effectively estimate the function at near-parametric convergence rates. This novel approach and computational algorithm could lead to methods useful to practitioners in many areas of science and engineering. Our theoretical results reveal behaviour universal to this class of nonparametric estimation problems. We explore a general setting involving tensor product spaces and build upon the smoothing spline analysis of variance framework. For d-dimensional models under full interaction, the optimal rates with gradient information on p covariates are identical to those for the (d − p)-interaction models without gradients and, therefore, the models are immune to the curse of interaction. For additive models, the optimal rates using gradient information are root-n, thus achieving the parametric rate. We demonstrate aspects of the theoretical results through synthetic and real data applications.