Zhu Hongxiao, Brown Philip J, Morris Jeffrey S
Statistical and Applied Mathematical Sciences Institute, RTP, NC.
J Am Stat Assoc. 2011 Sep 1;106(495):1167-1179. doi: 10.1198/jasa.2011.tm10370.
Functional data are increasingly encountered in scientific studies, and their high dimensionality and complexity lead to many analytical challenges. Various methods for functional data analysis have been developed, including functional response regression methods that involve regression of a functional response on univariate/multivariate predictors with nonparametrically represented functional coefficients. In existing methods, however, the functional regression can be sensitive to outlying curves and outlying regions of curves, and so is not robust. In this paper, we introduce a new Bayesian method, robust functional mixed models (R-FMM), for performing robust functional regression within the general functional mixed model framework, which includes multiple continuous or categorical predictors and random effect functions accommodating potential between-function correlation induced by the experimental design. The underlying model involves a hierarchical scale mixture model for the fixed effect, random effect, and residual error functions. These modeling assumptions across curves result in robust nonparametric estimators of the fixed and random effect functions which down-weight outlying curves and regions of curves, and produce statistics that can be used to flag global and local outliers. These assumptions also lead to distributions across wavelet coefficients that have outstanding sparsity and adaptive shrinkage properties, with great flexibility for the data to determine the sparsity and the heaviness of the tails. Together with the down-weighting of outliers, these within-curve properties lead to fixed and random effect function estimates that appear in our simulations to be remarkably adaptive in their ability to remove spurious features yet retain true features of the functions. We have developed general code to implement this fully Bayesian method that is automatic, requiring the user to provide only the functional data and design matrices. It is efficient enough to handle large data sets, and yields posterior samples of all model parameters that can be used to perform desired Bayesian estimation and inference. Although we present details for a specific implementation of the R-FMM using specific distributional choices in the hierarchical model, 1D functions, and wavelet transforms, the method can be applied more generally using other heavy-tailed distributions, higher dimensional functions (e.g., images), and other invertible transformations as alternatives to wavelets.
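As a rough illustration of how a scale mixture of normals can down-weight outlying regions of curves, the sketch below fits a pointwise location estimate under a Student-t error model, which is one well-known normal scale mixture. It is not the R-FMM: it uses a simple EM iteration rather than a fully Bayesian sampler, it works on the raw grid rather than on wavelet coefficients, and it has no predictors or random effects. The Student-t choice, the function name robust_pointwise_mean, the degrees of freedom nu, and the toy data are all assumptions made for illustration only.

```python
def robust_pointwise_mean(curves, nu=3.0, n_iter=200):
    """Pointwise Student-t location estimate via EM (illustrative only).

    curves: list of equal-length lists, one list per observed curve.
    Returns (means, weights); weights[i][t] is the mixture weight of
    curve i at grid point t. Small weights mark down-weighted values.
    """
    n = len(curves)
    n_points = len(curves[0])
    means = []
    weights = [[1.0] * n_points for _ in range(n)]
    for t in range(n_points):
        y = [curve[t] for curve in curves]
        # Start from the ordinary (non-robust) mean and variance.
        mu = sum(y) / n
        sigma2 = max(sum((v - mu) ** 2 for v in y) / n, 1e-12)
        w = [1.0] * n
        for _ in range(n_iter):
            # E-step: expected precision multiplier for each observation.
            w = [(nu + 1.0) / (nu + (v - mu) ** 2 / sigma2) for v in y]
            # M-step: weighted mean and weighted scale (floored to avoid 0).
            mu = sum(wi * v for wi, v in zip(w, y)) / sum(w)
            sigma2 = max(
                sum(wi * (v - mu) ** 2 for wi, v in zip(w, y)) / n, 1e-12
            )
        means.append(mu)
        for i in range(n):
            weights[i][t] = w[i]
    return means, weights


# Toy data: five curves on three grid points; the last curve has a
# local outlier (10.0) at the middle grid point.
curves = [
    [0.10, 0.00, -0.10],
    [-0.10, 0.10, 0.00],
    [0.00, -0.10, 0.10],
    [0.05, 0.05, -0.05],
    [0.00, 10.00, 0.00],
]
means, weights = robust_pointwise_mean(curves)
plain_mean_mid = sum(curve[1] for curve in curves) / len(curves)
```

On this toy input, the ordinary mean at the middle grid point is pulled toward the outlier, while the weighted estimate stays near the bulk of the curves. The small weight assigned to the outlying value plays a role loosely analogous to the local outlier statistics mentioned in the abstract.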