Ma Renjun, Islam Md Dedarul, Hasan M Tariqul, Jørgensen Bent
Department of Mathematics and Statistics, University of New Brunswick, Fredericton, NB E3B 5A3, Canada.
Department of Statistics, University of Southern Denmark, DK-5230 Odense, Denmark.
Entropy (Basel). 2023 May 28;25(6):863. doi: 10.3390/e25060863.
Multilevel semicontinuous data occur frequently in medical, environmental, insurance and financial studies. Such data are often measured with covariates at different levels; however, they have traditionally been modelled with covariate-independent random effects. Ignoring the dependence between cluster-specific random effects and cluster-specific covariates in these traditional approaches may lead to the ecological fallacy and produce misleading results. In this paper, we propose a Tweedie compound Poisson model with covariate-dependent random effects to analyze multilevel semicontinuous data, in which covariates at different levels are incorporated at the relevant levels. Estimation for our models is based on the orthodox best linear unbiased predictor of the random effects. Explicit expressions for the random effects predictors facilitate computation and interpretation of our models. Our approach is illustrated through an analysis of data from the basic symptoms inventory study, in which 409 adolescents from 269 families were each observed between 1 and 17 times. The performance of the proposed methodology was also examined through simulation studies.
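As a rough illustration of why a Tweedie compound Poisson distribution suits semicontinuous data, the sketch below draws from the standard Poisson sum of gamma variables with mean mu, dispersion phi and power parameter p between 1 and 2. It is not the authors' model or estimation procedure: it contains no random effects or covariates, and the function names (poisson_draw, tweedie_cp_draw) and the parameter values in the usage note are hypothetical choices made only for this example.

```python
import math
import random


def poisson_draw(lam, rng):
    """Draw a Poisson(lam) count with Knuth's multiplication method.

    Adequate for the small rates used here; not meant for large lam.
    """
    threshold = math.exp(-lam)
    count = 0
    product = rng.random()
    while product > threshold:
        count += 1
        product *= rng.random()
    return count


def tweedie_cp_draw(mu, phi, p, rng):
    """Draw one Tweedie compound Poisson value with mean mu and variance phi * mu**p.

    Uses the standard representation for 1 < p < 2: a Poisson number of
    gamma-distributed jumps, summed. Zero jumps gives an exact zero.
    """
    if not 1.0 < p < 2.0:
        raise ValueError("power parameter p must lie strictly between 1 and 2")
    lam = mu ** (2.0 - p) / (phi * (2.0 - p))   # Poisson rate
    shape = (2.0 - p) / (p - 1.0)               # gamma shape of each jump
    scale = phi * (p - 1.0) * mu ** (p - 1.0)   # gamma scale of each jump
    n_jumps = poisson_draw(lam, rng)
    return math.fsum(rng.gammavariate(shape, scale) for _ in range(n_jumps))
```

For example, calling tweedie_cp_draw(2.0, 1.0, 1.5, random.Random(0)) repeatedly yields a mixture of exact zeros and positive continuous values; the probability of a zero is exp(-lam), and the sample mean approaches mu as the number of draws grows.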