Jiang Fei, Tian Lu, Kang Jian, Li Lexin
University of California at San Francisco, Stanford University.
University of Michigan, University of California at Berkeley.
Stat Sin. 2025 Jul;35(3):1713-1736. doi: 10.5705/ss.202023.0075.
Classical regression generally assumes that all subjects follow a common model with the same set of parameters. With the ever-advancing capabilities of modern technologies to collect more subjects and more covariates, it has become increasingly common for subgroups of subjects to exist, with each subgroup following a different regression model with a different set of parameters. In this article, we propose a new approach for subgroup analysis in regression modeling. Specifically, we model the relation between a response and a set of primary predictors, while explicitly modeling the heterogeneous association given another set of auxiliary predictors through the interaction between the primary and auxiliary variables. We introduce penalties to induce sparsity and group structure in the regression coefficients, and to achieve simultaneous feature selection for both the primary predictors that are significantly associated with the response and the auxiliary predictors that define the subgroups. We establish asymptotic guarantees in terms of parameter estimation consistency and cluster estimation consistency. We illustrate our method with an analysis of functional magnetic resonance imaging data from the Adolescent Brain Cognitive Development Study.
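The abstract does not spell out the model or the penalties, so the following is only a rough sketch of the general idea under stated assumptions, not the authors' estimator. It assumes a linear model in which each primary predictor x_j has a main effect plus interactions x_j * z_k with the auxiliary predictors, so that the slope of x_j varies with z. It also swaps in a plain lasso penalty, fitted by proximal gradient descent, for the paper's sparsity and group-structure penalties. The function names, the simulated data, and the tuning value `lam` are all hypothetical.

```python
import numpy as np

def interaction_design(X, Z):
    """Stack primary main effects with all primary-by-auxiliary products.

    The product x_j * z_k is stored in column p + j * q + k.
    """
    n, p = X.shape
    q = Z.shape[1]
    inter = (X[:, :, None] * Z[:, None, :]).reshape(n, p * q)
    return np.hstack([X, inter])

def soft_threshold(v, t):
    """Proximal operator of t * ||.||_1."""
    return np.sign(v) * np.maximum(np.abs(v) - t, 0.0)

def lasso_ista(D, y, lam, n_iter=2000):
    """Minimize (1/2n) * ||y - D b||^2 + lam * ||b||_1 by proximal gradient.

    Assumption: a plain lasso stands in for the paper's penalties.
    """
    n, d = D.shape
    # Step size = 1 / Lipschitz constant of the smooth part's gradient.
    step = n / np.linalg.norm(D, 2) ** 2
    b = np.zeros(d)
    for _ in range(n_iter):
        grad = D.T @ (D @ b - y) / n
        b = soft_threshold(b - step * grad, step * lam)
    return b

# Simulated example (hypothetical data, not from the paper).
rng = np.random.default_rng(0)
n, p, q = 400, 5, 3
X = rng.normal(size=(n, p))                       # primary predictors
Z = rng.integers(0, 2, size=(n, q)).astype(float) # binary auxiliary predictors

# Only x_0 matters, and its slope differs between the z_0 = 0 and z_0 = 1 groups.
y = 2.0 * X[:, 0] + 1.5 * X[:, 0] * Z[:, 0] + 0.5 * rng.normal(size=n)

D = interaction_design(X, Z)
b_hat = lasso_ista(D, y, lam=0.1)
print("nonzero coefficient indices:", np.flatnonzero(np.abs(b_hat) > 1e-8))
```

In this sketch, a nonzero main-effect coefficient flags a primary predictor associated with the response, while a nonzero interaction coefficient flags an auxiliary predictor that splits subjects into subgroups with different slopes. Here the estimated slope of x_0 would be `b_hat[0]` for subjects with z_0 = 0 and `b_hat[0] + b_hat[5]` for subjects with z_0 = 1, both shrunk somewhat toward zero by the lasso.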