Garcia Nancy L, Rodrigues-Motta Mariana, Migon Helio S, Petkova Eva, Tarpey Thaddeus, Ogden R Todd, Giordano Julio O, Perez Martin M
Department of Statistics, Universidade Estadual de Campinas, Campinas, Brazil.
Department of Statistics, Federal University of Rio de Janeiro, Rio de Janeiro, Brazil.
J R Stat Soc Ser C Appl Stat. 2024 Feb 7;73(3):658-681. doi: 10.1093/jrsssc/qlae006. eCollection 2024 Jun.
We consider unsupervised classification by means of a latent multinomial variable that assigns a scalar response to one of the L components of a mixture model incorporating scalar and functional covariates. This process can be thought of as a hierarchical model: the first level models the scalar response as a mixture of parametric distributions, and the second level models the mixture probabilities through a generalized linear model with functional and scalar covariates. The traditional approach of treating functional covariates as vectors suffers from the curse of dimensionality, since functional covariates can be measured at very small intervals, which leads to a highly parametrized model; it also ignores the functional nature of the data. We use basis expansions to reduce the dimensionality and a Bayesian approach to estimate the parameters while providing predictions of the latent classification vector. The method is motivated by two data examples that are not easily handled by existing methods. The first concerns identifying placebo responders in a clinical trial (normal mixture model), and the second concerns predicting illness in milking cows (zero-inflated mixture of Poisson models).
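The two-level structure described in the abstract can be illustrated with a minimal simulation sketch. It is not the authors' implementation: the Fourier basis, the grid size, the number of basis functions, the multinomial-logit link with class 0 as reference, and every parameter value are assumptions chosen for illustration, and no Bayesian estimation is performed. The sketch only shows how a basis expansion turns each densely sampled curve into a few scores, and how those scores, together with scalar covariates, drive the mixture probabilities of a normal mixture.

```python
import numpy as np

def fourier_basis(t, K):
    """Evaluate K Fourier basis functions on a grid t in [0, 1]; returns (len(t), K)."""
    cols = [np.ones_like(t)]
    k = 1
    while len(cols) < K:
        cols.append(np.sqrt(2) * np.sin(2 * np.pi * k * t))
        if len(cols) < K:
            cols.append(np.sqrt(2) * np.cos(2 * np.pi * k * t))
        k += 1
    return np.column_stack(cols)

def basis_scores(X, t, Phi):
    """Approximate the integrals of X_i(t) * phi_k(t) by a Riemann sum; returns (n, K)."""
    dt = t[1] - t[0]
    return X @ Phi * dt

def mixing_probs(W, Z, alpha, gamma):
    """Multinomial-logit probabilities for L classes, with class 0 as reference."""
    eta = W @ alpha + Z @ gamma                      # (n, L-1) linear predictors
    eta = np.column_stack([np.zeros(len(eta)), eta])  # reference class has eta = 0
    eta -= eta.max(axis=1, keepdims=True)            # stabilise the softmax
    p = np.exp(eta)
    return p / p.sum(axis=1, keepdims=True)

# Hypothetical sizes: n subjects, T grid points per curve, K basis functions, L classes.
rng = np.random.default_rng(0)
n, T, K, L = 50, 200, 5, 2
t = np.linspace(0, 1, T)
Phi = fourier_basis(t, K)

# Simulated functional covariates: smooth curves plus measurement noise.
X = rng.normal(size=(n, K)) @ Phi.T + 0.1 * rng.normal(size=(n, T))
# Scalar covariates: an intercept and one simulated covariate.
W = np.column_stack([np.ones(n), rng.normal(size=n)])

# Second level: T = 200 raw values per curve are reduced to K = 5 scores.
Z = basis_scores(X, t, Phi)
alpha = rng.normal(size=(W.shape[1], L - 1))  # illustrative scalar-covariate effects
gamma = rng.normal(size=(K, L - 1))           # illustrative basis coefficients
probs = mixing_probs(W, Z, alpha, gamma)

# First level: draw the latent class, then the response from that normal component.
labels = np.array([rng.choice(L, p=p) for p in probs])
mu = np.array([0.0, 3.0])      # illustrative component means
sigma = np.array([1.0, 0.5])   # illustrative component standard deviations
y = rng.normal(mu[labels], sigma[labels])
```

In this sketch the functional part of the linear predictor needs only K coefficients per non-reference class instead of one per grid point, which is the dimension reduction the abstract refers to. Replacing the normal components with a point mass at zero and a Poisson distribution would give the zero-inflated variant mentioned for the second example.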