Randolph Timothy W, Harezlak Jaroslaw, Feng Ziding
Fred Hutchinson Cancer Research Center, Biostatistics and Biomathematics Program, Seattle, WA 98109.
Electron J Stat. 2012 Jan 1;6:323-353. doi: 10.1214/12-EJS676.
One of the challenges with functional data is incorporating geometric structure, or local correlation, into the analysis. This structure is inherent in the output from an increasing number of biomedical technologies, and a functional linear model is often used to estimate the relationship between the predictor functions and scalar responses. Common approaches to the problem of estimating a coefficient function typically involve two stages: regularization and estimation. Regularization is usually done via dimension reduction, projecting onto a predefined span of basis functions or a reduced set of eigenvectors (principal components). In contrast, we present a unified approach that directly incorporates geometric structure into the estimation process by exploiting the joint eigenproperties of the predictors and a linear penalty operator. In this sense, the components in the regression are 'partially empirical' and the framework is provided by the generalized singular value decomposition (GSVD). The form of the penalized estimation is not new, but the GSVD clarifies the process and informs the choice of penalty by making explicit the joint influence of the penalty and predictors on the bias, variance and performance of the estimated coefficient function. Laboratory spectroscopy data and simulations are used to illustrate the concepts.
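As a rough illustration of the penalized estimation described above, the sketch below minimizes ||y - Xb||^2 + alpha ||Lb||^2 for a linear penalty operator L. It is a minimal sketch under stated assumptions, not the authors' GSVD-based procedure: the second-difference choice of L, the simulated curves, and the names `second_difference` and `penalized_estimate` are all hypothetical and chosen only for illustration. The estimate is computed by ordinary least squares on an augmented system, which has the same minimizer as the penalized criterion.

```python
import numpy as np


def second_difference(p):
    """Return the (p-2) x p second-difference operator (an assumed smoothness penalty)."""
    L = np.zeros((p - 2, p))
    for i in range(p - 2):
        L[i, i:i + 3] = [1.0, -2.0, 1.0]
    return L


def penalized_estimate(X, y, L, alpha):
    """Minimize ||y - X b||^2 + alpha * ||L b||^2 via an augmented least-squares system."""
    X_aug = np.vstack([X, np.sqrt(alpha) * L])
    y_aug = np.concatenate([y, np.zeros(L.shape[0])])
    b, *_ = np.linalg.lstsq(X_aug, y_aug, rcond=None)
    return b


# Simulated, purely illustrative data: locally correlated predictor curves
# and a smooth coefficient function (not the spectroscopy data from the paper).
rng = np.random.default_rng(0)
n, p = 40, 100
t = np.linspace(0.0, 1.0, p)
X = np.cumsum(rng.standard_normal((n, p)), axis=1) / np.sqrt(p)
beta_true = np.sin(2.0 * np.pi * t)
y = X @ beta_true / p + 0.01 * rng.standard_normal(n)

L = second_difference(p)
b_small = penalized_estimate(X, y, L, alpha=1.0)
b_large = penalized_estimate(X, y, L, alpha=100.0)
```

Increasing `alpha` trades fidelity to the data for a smaller value of ||Lb||, which is the bias-variance trade-off the abstract refers to; the joint influence of `X` and `L` on that trade-off is what the GSVD makes explicit.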