Chen Chixiang, Shen Biyi, Zhang Lijun, Xue Yuan, Wang Ming
Division of Biostatistics and Bioinformatics, Department of Public Health Sciences, Penn State College of Medicine, Hershey, Pennsylvania.
Department of Biochemistry and Molecular Biology, Institute for Personalized Medicine, Penn State College of Medicine, Hershey, Pennsylvania.
Biometrics. 2019 Sep;75(3):950-965. doi: 10.1111/biom.13060. Epub 2019 Apr 25.
Longitudinal data are common in clinical trials and observational studies, where missing outcomes due to dropout are frequently encountered. In this setting, under the missing-at-random assumption, the weighted generalized estimating equation (WGEE) approach is widely adopted for marginal analysis. Model selection for the marginal mean regression is a crucial aspect of data analysis, and identifying an appropriate correlation structure for model fitting may also be of interest. However, the existing information criteria for model selection in WGEE have limitations, such as requiring separate criteria for selecting the marginal mean and the correlation structure, and unsatisfactory selection performance in small samples. In particular, few studies have developed joint information criteria for selecting both the marginal mean and the correlation structure. In this work, by embedding empirical likelihood into the WGEE framework, we propose two information criteria, the joint empirical Akaike information criterion and the joint empirical Bayesian information criterion, which simultaneously select the variables for the marginal mean regression and the correlation structure. In extensive simulation studies, these empirical-likelihood-based criteria exhibit robustness, flexibility, and better performance than other criteria, including the weighted quasi-likelihood under the independence model criterion, the missing longitudinal information criterion, and the joint longitudinal information criterion. In addition, we provide a theoretical justification for the proposed criteria and present two real data examples for further illustration.
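For orientation, the WGEE approach named in the abstract is usually written as an inverse-probability-weighted estimating equation. The form below is a generic textbook sketch, not the paper's own formulation; all symbols (D_i, V_i, W_i, R_ij, pi_ij, n_i) are notation assumed here for illustration.

```latex
% Generic WGEE estimating equation (assumed notation, not taken from the paper)
\[
  \sum_{i=1}^{n} D_i^{\top} V_i^{-1} W_i \bigl( Y_i - \mu_i(\beta) \bigr) = 0,
  \qquad
  W_i = \operatorname{diag}\!\left( \frac{R_{i1}}{\pi_{i1}}, \dots, \frac{R_{i n_i}}{\pi_{i n_i}} \right),
\]
% D_i   = \partial \mu_i(\beta) / \partial \beta^{\top}  (derivative of the marginal mean)
% V_i   = working covariance matrix, which depends on the chosen correlation structure
% R_ij  = 1 if the outcome of subject i at visit j is observed, 0 otherwise
% pi_ij = probability that subject i is still observed at visit j, estimated from a dropout model under MAR
```

In this form the two selection targets mentioned in the abstract appear separately: the choice of covariates enters through the marginal mean mu_i(beta), while the choice of correlation structure enters through V_i.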