A joint modeling approach to data with informative cluster size: robustness to the cluster size model.

Affiliation

Biostatistics and Bioinformatics Branch, Division of Epidemiology, Statistics and Prevention Research, Eunice Kennedy Shriver National Institute of Child Health and Human Development, National Institutes of Health, Rockville, MD 20852, U.S.A.

Publication information

Stat Med. 2011 Jul 10;30(15):1825-36. doi: 10.1002/sim.4239. Epub 2011 Apr 15.

DOI:10.1002/sim.4239

PMID:21495060

Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC3115426/

Abstract

In many biomedical and epidemiological studies, data are often clustered due to longitudinal follow-up or repeated sampling. While in some clustered data the cluster size is pre-determined, in others it may be correlated with the outcome of subunits, resulting in informative cluster size. When the cluster size is informative, standard statistical procedures that ignore cluster size may produce biased estimates. One attractive framework for modeling data with informative cluster size is the joint modeling approach, in which a common set of random effects is shared by both the outcome and cluster size models. In addition to making distributional assumptions on the shared random effects, the joint modeling approach needs to specify the cluster size model. Questions arise as to whether the joint modeling approach is robust to misspecification of the cluster size model. In this paper, we studied both asymptotic and finite-sample characteristics of the maximum likelihood estimators in joint models when the cluster size model is misspecified. We found that using an incorrect distribution for the cluster size may induce small to moderate biases, while using a misspecified functional form for the shared random parameter in the cluster size model results in nearly unbiased estimation of outcome model parameters. We also found that there is little efficiency loss under this model misspecification. A developmental toxicity study was used to motivate the research and to demonstrate the findings.
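To make the idea of informative cluster size concrete, the following is a minimal simulation sketch, not the model used in the paper. It assumes a single shared normal random effect `b` per cluster, a logistic model for a binary subunit outcome, and a shifted Poisson model for cluster size. The parameter names `beta0`, `gamma0`, `gamma1`, and `sigma`, and all function names, are hypothetical choices made for illustration. When `gamma1` is nonzero, the same `b` drives both the outcome and the cluster size, so an analysis that pools all subunits and ignores cluster size gives a different answer from one that averages cluster-level means.

```python
import math
import random


def expit(x):
    """Inverse logit."""
    return 1.0 / (1.0 + math.exp(-x))


def draw_poisson(rng, lam):
    """Draw a Poisson(lam) variate with Knuth's multiplication method."""
    threshold = math.exp(-lam)
    k, p = 0, 1.0
    while True:
        p *= rng.random()
        if p <= threshold:
            return k
        k += 1


def simulate(n_clusters, beta0, gamma0, gamma1, sigma, seed):
    """Simulate clusters of binary outcomes with a shared random effect.

    Assumed (illustrative) model:
      b_i  ~ Normal(0, sigma^2)                          shared random effect
      Y_ij ~ Bernoulli(expit(beta0 + b_i))               outcome model
      N_i  = 1 + Poisson(exp(gamma0 + gamma1 * b_i))     cluster size model
    gamma1 = 0 makes the cluster size non-informative.
    """
    rng = random.Random(seed)
    clusters = []
    for _ in range(n_clusters):
        b = rng.gauss(0.0, sigma)
        size = 1 + draw_poisson(rng, math.exp(gamma0 + gamma1 * b))
        p = expit(beta0 + b)
        clusters.append([1 if rng.random() < p else 0 for _ in range(size)])
    return clusters


def pooled_mean(clusters):
    """Outcome proportion over all subunits, ignoring cluster membership."""
    total = sum(sum(c) for c in clusters)
    count = sum(len(c) for c in clusters)
    return total / count


def mean_of_cluster_means(clusters):
    """Average of per-cluster proportions; each cluster has equal weight."""
    return sum(sum(c) / len(c) for c in clusters) / len(clusters)


if __name__ == "__main__":
    for label, g1 in [("informative", -0.8), ("non-informative", 0.0)]:
        data = simulate(5000, beta0=0.0, gamma0=1.0, gamma1=g1,
                        sigma=1.0, seed=1)
        print(label, round(pooled_mean(data), 3),
              round(mean_of_cluster_means(data), 3))
```

With a negative `gamma1`, clusters with a higher outcome probability tend to be smaller, so the pooled proportion falls below the average of cluster-level proportions; with `gamma1 = 0` the two summaries should roughly agree. A joint model of the kind the abstract describes would instead fit the outcome and cluster size models together through the shared random effect.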
