Liu Shuling, Manatunga Amita K, Peng Limin, Marcus Michele
Department of Biostatistics and Bioinformatics, Emory University, Atlanta, Georgia, U.S.A.
Department of Epidemiology, Emory University, Atlanta, Georgia, U.S.A.
Biometrics. 2017 Jun;73(2):666-677. doi: 10.1111/biom.12588. Epub 2016 Oct 4.
In many biomedical studies that involve correlated data, an outcome is measured repeatedly for each subject, and the number of these measurements is itself treated as an observed outcome. Barnhart and Sampson (1995) referred to this type of data as multivariate random length data. A common approach to handling such data is to jointly model the multiple measurements and the random length. In the previous literature, a key assumption is multivariate normality of the multiple measurements. Motivated by a reproductive study, we propose a new copula-based joint model that relaxes the normality assumption. Specifically, we adopt the Clayton-Oakes model for the multiple measurements, with flexible marginal distributions specified as semi-parametric transformation models. The random length is modeled via a generalized linear model. We develop an approximate EM algorithm to derive parameter estimators, and standard errors of the estimators are obtained through bootstrap procedures. The finite-sample performance of the proposed method is investigated using simulation studies. We apply our method to the Mount Sinai Study of Women Office Workers (MSSWOW), in which women were prospectively followed for 1 year to study fertility.
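To make the data structure concrete, the following is a minimal simulation sketch of multivariate random length data with Clayton-type dependence. It is not the authors' estimation procedure. It assumes a gamma-frailty sampler for the Clayton copula, a Poisson log-linear model for the random length (shifted so that each subject has at least one measurement), and exponential margins standing in for the semi-parametric transformation models. The function names `sample_clayton` and `simulate_subject` and the coefficients `alpha` and `beta` are hypothetical and chosen only for illustration.

```python
import numpy as np


def sample_clayton(n_dim, theta, rng):
    """Draw one vector of n_dim uniforms from a Clayton copula (theta > 0).

    Uses the gamma-frailty construction: with V ~ Gamma(1/theta, 1) and
    independent E_j ~ Exp(1), U_j = (1 + E_j / V) ** (-1 / theta).
    """
    v = rng.gamma(shape=1.0 / theta, scale=1.0)
    e = rng.exponential(scale=1.0, size=n_dim)
    return (1.0 + e / v) ** (-1.0 / theta)


def simulate_subject(x, theta, rng, alpha=(0.5, 0.3), beta=0.2):
    """Simulate one subject's random length and correlated measurements.

    x     : scalar covariate
    alpha : hypothetical intercept and slope of the log-linear length model
    beta  : hypothetical covariate effect on the marginal rate
    """
    # Random length: 1 + Poisson count with a log link (assumed GLM form).
    length = 1 + rng.poisson(np.exp(alpha[0] + alpha[1] * x))
    # Dependent uniforms, interpreted as marginal survival probabilities.
    u = sample_clayton(length, theta, rng)
    # Assumed exponential margin with rate exp(beta * x): invert S(y) = u.
    y = -np.log(u) / np.exp(beta * x)
    return length, y


rng = np.random.default_rng(2017)
length, y = simulate_subject(x=1.0, theta=2.0, rng=rng)
print(length, np.round(y, 3))
```

Under this construction, any pair of measurements within a subject has Kendall's tau equal to theta / (theta + 2), so `theta=2.0` corresponds to a tau of 0.5. Replacing the exponential inverse with another marginal quantile function changes the margins without altering the dependence structure.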