Volfovsky Alexander, Hoff Peter D
Harvard University and University of Washington.
Ann Appl Stat. 2014 Mar 1;8(1):19-47. doi: 10.1214/13-AOAS685.
ANOVA decompositions are a standard method for describing and estimating heterogeneity among the means of a response variable across levels of multiple categorical factors. In such a decomposition, the complete set of main effects and interaction terms can be viewed as a collection of vectors, matrices and arrays that share various index sets defined by the factor levels. For many types of categorical factors, it is plausible that an ANOVA decomposition exhibits some consistency across orders of effects, in that the levels of a factor that have similar main-effect coefficients may also have similar coefficients in higher-order interaction terms. In such a case, estimation of the higher-order interactions should be improved by borrowing information from the main effects and lower-order interactions. To take advantage of such patterns, this article introduces a class of hierarchical prior distributions for collections of interaction arrays that can adapt to the presence of such interactions. These prior distributions are based on a type of array-variate normal distribution, for which a covariance matrix for each factor is estimated. Such a prior can adapt to potential similarities among the levels of a factor and incorporate any such information into the estimation of every effect in which the factor appears. When such similarities are present, the prior can borrow information from well-estimated main effects and lower-order interactions to assist in the estimation of higher-order terms for which the data provide limited information.
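The idea of one covariance matrix per factor, shared by every effect in which that factor appears, can be illustrated with a minimal sketch. It assumes the array-variate normal has the separable (Kronecker-structured) form, which the abstract does not spell out, and it uses fixed covariance matrices rather than estimating them as the article's hierarchical prior does. The function names (`mode_product`, `sample_array_normal`) and the matrices `sigma_a` and `sigma_b` are hypothetical and chosen only for illustration.

```python
import numpy as np


def mode_product(arr, mat, mode):
    """Multiply `arr` along axis `mode` by the matrix `mat`."""
    moved = np.moveaxis(arr, mode, 0)            # bring the chosen mode to the front
    out = np.tensordot(mat, moved, axes=(1, 0))  # contract mat's columns with that mode
    return np.moveaxis(out, 0, mode)             # put the mode back in place


def sample_array_normal(covs, rng):
    """Draw one mean-zero array whose covariance is separable across modes.

    `covs` holds one positive-definite covariance matrix per mode (factor).
    """
    shape = tuple(c.shape[0] for c in covs)
    z = rng.standard_normal(shape)               # i.i.d. standard normal entries
    for mode, cov in enumerate(covs):
        z = mode_product(z, np.linalg.cholesky(cov), mode)
    return z


rng = np.random.default_rng(0)

# Hypothetical covariances: factor A has 3 levels, with levels 0 and 1 similar;
# factor B has 2 unrelated levels.
sigma_a = np.array([[1.0, 0.9, 0.0],
                    [0.9, 1.0, 0.0],
                    [0.0, 0.0, 1.0]])
sigma_b = np.eye(2)

# The same sigma_a governs both the main effect of A and the A-by-B interaction,
# so levels that are similar in the main effect are also similar in the interaction.
main_a = sample_array_normal([sigma_a], rng)            # shape (3,)
inter_ab = sample_array_normal([sigma_a, sigma_b], rng)  # shape (3, 2)
```

Under this assumed structure, rows 0 and 1 of `inter_ab` are strongly correlated across draws because `sigma_a` ties those two levels together, which is the mechanism by which a well-estimated main effect could inform a sparsely observed interaction.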