Division of Biostatistics, University of Minnesota, Minneapolis, Minnesota, USA.
Department of Speech and Hearing Science, The Ohio State University, Columbus, Ohio, USA.
Stat Med. 2023 Jul 10;42(15):2619-2636. doi: 10.1002/sim.9740. Epub 2023 Apr 9.
This work is motivated by the need to accurately model a vector of responses related to pediatric functional status using administrative health data from inpatient rehabilitation visits. The components of the response vector have known, structured interrelationships. To exploit these relationships, we develop a two-pronged regularization approach that borrows information across responses. The first component encourages joint selection of each variable's effects across possibly overlapping groups of related responses; the second encourages the effects for related responses to shrink towards each other. Because the responses in our motivating study are not normally distributed, our approach does not rely on an assumption of multivariate normality. We show that, with an adaptive version of our penalty, the estimates have the same asymptotic distribution as if we had known in advance which variables have non-zero effects and which variables have the same effects across some outcomes. We demonstrate the performance of our method in extensive numerical studies and in an application predicting the functional status of pediatric patients from administrative health data in a population of children with neurological injury or illness at a large children's hospital.
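As a rough illustration of the two-pronged idea, the sketch below evaluates a penalty made of an overlapping group-lasso term and a pairwise fusion term on a small coefficient matrix. It is not the authors' formulation: the function name `two_part_penalty`, the square-root group-size weighting, the tuning parameters `lam_group` and `lam_fuse`, and the toy groups and pairs are all assumptions made for illustration, and the adaptive weights mentioned in the abstract are omitted.

```python
import math


def two_part_penalty(B, groups, pairs, lam_group, lam_fuse):
    """Evaluate a hypothetical two-part penalty.

    B[j][k] is the effect of variable j on response k.
    groups: tuples of response indices (groups may overlap).
    pairs: tuples (k, l) of related responses whose effects are fused.
    """
    group_term = 0.0
    fuse_term = 0.0
    for row in B:
        # Group term: an L2 norm per group of related responses, which
        # encourages a variable's effects to be selected or dropped jointly.
        for g in groups:
            norm = math.sqrt(sum(row[k] ** 2 for k in g))
            group_term += math.sqrt(len(g)) * norm  # assumed weighting
        # Fusion term: absolute differences between effects on related
        # responses, which encourages those effects to shrink together.
        for k, l in pairs:
            fuse_term += abs(row[k] - row[l])
    return lam_group * group_term + lam_fuse * fuse_term


# Toy example: 2 variables, 3 responses, two overlapping response groups.
B = [
    [3.0, 4.0, 0.0],
    [0.0, 0.0, 0.0],
]
groups = [(0, 1), (1, 2)]
pairs = [(0, 1), (1, 2)]
penalty = two_part_penalty(B, groups, pairs, lam_group=1.0, lam_fuse=0.5)
print(round(penalty, 4))
```

The second variable has no effect on any response and so contributes nothing to either term, which is the kind of joint sparsity the group term is meant to favor. In practice a penalty like this would be added to a loss function that does not assume multivariate normality and minimized over `B`.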