Austin Erin, Pan Wei, Shen Xiaotong
Department of Mathematical and Statistical Sciences, University of Colorado Denver, 80204.
Division of Biostatistics, School of Public Health, University of Minnesota, Minneapolis, MN 55455.
Stat Sin. 2020 Apr;30(2):783-807. doi: 10.5705/ss.202016.0531.
For some modeling problems, a population may be better described as an aggregate of unknown subpopulations, each with a distinct relationship between a response and its associated variables. The finite mixture of regressions (FMR) model, in which each outcome is generated by one of a finite number of linear regression models, is a natural tool in this setting. In this article, we first propose a new penalized regression approach. We then demonstrate that the proposed approach identifies subpopulations and their corresponding models better than a semiparametric FMR method does. Our new method fits a model for each individual via grouping pursuit, using a new group-truncated penalty that shrinks the differences between estimated parameter vectors. As a result, the individuals' models cluster into a few common models, in turn revealing previously unknown subpopulations. Moreover, by varying the penalty strength, the new method can reveal a hierarchical structure among the subpopulations, which can be useful in exploratory analyses. Simulations using FMR models and a real-data analysis show that the method performs well.
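To make the setup in the abstract concrete, the following is a minimal Python sketch, not the authors' implementation. It simulates data from an FMR model and evaluates a penalized objective in which every individual has their own coefficient vector. The abstract does not state the exact form of the group-truncated penalty, so the form used here, the sum over pairs of min(||b_i - b_j||_2, tau), is an assumption, as are the squared-error loss and all names (`simulate_fmr`, `group_truncated_penalty`, `penalized_objective`, `lam`, `tau`).

```python
import numpy as np


def simulate_fmr(n, betas, weights, sigma, rng):
    """Draw n observations from a finite mixture of linear regressions.

    betas:   (K, p) array, one coefficient vector per subpopulation.
    weights: length-K mixing proportions that sum to 1.
    """
    betas = np.asarray(betas, dtype=float)
    K, p = betas.shape
    # Latent subpopulation label of each individual.
    labels = rng.choice(K, size=n, p=weights)
    X = rng.normal(size=(n, p))
    # Each outcome comes from the linear model of its own subpopulation.
    y = np.einsum("ij,ij->i", X, betas[labels]) + sigma * rng.normal(size=n)
    return X, y, labels


def group_truncated_penalty(B, tau):
    """Assumed penalty form: sum over pairs i < j of min(||B[i] - B[j]||_2, tau)."""
    diffs = B[:, None, :] - B[None, :, :]
    dists = np.linalg.norm(diffs, axis=2)
    upper = np.triu_indices(B.shape[0], k=1)  # each unordered pair once
    return np.minimum(dists[upper], tau).sum()


def penalized_objective(B, X, y, lam, tau):
    """Squared-error loss with one coefficient row per individual, plus the penalty."""
    resid = y - np.einsum("ij,ij->i", X, B)
    return 0.5 * np.sum(resid ** 2) + lam * group_truncated_penalty(B, tau)


rng = np.random.default_rng(0)
true_betas = np.array([[1.0, 2.0], [-1.0, 0.5]])  # two hypothetical subpopulations
X, y, labels = simulate_fmr(50, true_betas, [0.5, 0.5], 0.1, rng)

# Evaluate the objective at the true per-individual coefficients.
B_true = true_betas[labels]
value = penalized_objective(B_true, X, y, lam=1.0, tau=0.5)
```

Under this assumed form, the truncation at `tau` caps the cost of any pair of individuals whose coefficient vectors are far apart, so only nearby vectors are pulled together. That is one way the fitted per-individual models could collapse into a few shared models, and increasing `lam` would merge more of them, in line with the hierarchical structure the abstract describes.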