Analysis of longitudinal data with unmeasured confounders.

Authors

Palta M, Yao T J

Affiliation

Biostatistics Center, Madison, Wisconsin 53706.

Publication

Biometrics. 1991 Dec;47(4):1355-69.

PMID: 1786323

Abstract

Confounding in longitudinal or clustered data creates special problems and opportunities because the relationship between the confounder and covariate of interest may differ across and within individuals or clusters. A well-known example of such confounding in longitudinal data is the presence of cohort and period effects in models of aging in epidemiologic research. We first formulate a data-generating model with confounding and derive the distribution of the response variable unconditional on the confounder. We then examine the properties of the regression coefficient for some analytic approaches when the confounder is omitted from the fitted model. The expected value of the regression coefficient differs in across- and within-individual regression. In the multivariate case, within- and between-individual information is combined and weighted according to the assumed covariance structure. We assume compound symmetry in the fitted covariance matrix and derive the variance, bias, and mean squared error of the slope estimate as a function of the fitted within-individual correlation. We find that even in this simplest multivariate case, the trade-off between bias and variance depends on a large number of parameters. It is generally preferable to fit correlations somewhat above the true correlation to minimize the effect of between-individual confounders or cohort effects. Period effects can lead to situations where it is advantageous to fit correlations that are below the true correlation. The results highlight the trade-offs inherent in the choice of method for analysis of longitudinal data, and show that an appropriate choice can be made only after determining whether within- or between-individual confounding is the major concern.
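The weighting of within- and between-individual information described in the abstract can be illustrated with a small simulation. This is a minimal sketch and not the authors' data-generating model: the function names `simulate` and `slope_cs`, the parameter values, and the choice of a purely between-individual confounder are assumptions made for illustration. It relies on a standard result for balanced data: the generalized least squares slope under a fitted compound-symmetry correlation rho combines the within- and between-individual sums of squares, with the between part weighted by theta = (1 - rho) / (1 + (n - 1) rho).

```python
import random

def simulate(n_subjects=1000, n_visits=4, beta=1.0, gamma=2.0, seed=0):
    """Return one (xs, ys) pair per subject; the confounder c is never returned."""
    rng = random.Random(seed)
    data = []
    for _ in range(n_subjects):
        c = rng.gauss(0.0, 1.0)          # unmeasured subject-level confounder (assumed)
        level = c + rng.gauss(0.0, 1.0)  # subject's covariate level, correlated with c
        xs = [level + rng.gauss(0.0, 1.0) for _ in range(n_visits)]
        ys = [beta * x + gamma * c + rng.gauss(0.0, 1.0) for x in xs]
        data.append((xs, ys))
    return data

def slope_cs(data, rho):
    """GLS slope (model with intercept) under fitted compound symmetry, balanced data."""
    n = len(data[0][0])
    total = len(data) * n
    gx = sum(sum(xs) for xs, _ in data) / total  # grand mean of x
    gy = sum(sum(ys) for _, ys in data) / total  # grand mean of y
    wxx = wxy = bxx = bxy = 0.0
    for xs, ys in data:
        mx, my = sum(xs) / n, sum(ys) / n
        # Within-individual sums of squares and cross-products
        wxx += sum((x - mx) ** 2 for x in xs)
        wxy += sum((x - mx) * (y - my) for x, y in zip(xs, ys))
        # Between-individual sums of squares and cross-products
        bxx += n * (mx - gx) ** 2
        bxy += n * (mx - gx) * (my - gy)
    # Relative weight given to between-individual information
    theta = (1.0 - rho) / (1.0 + (n - 1) * rho)
    return (wxy + theta * bxy) / (wxx + theta * bxx)

if __name__ == "__main__":
    data = simulate()
    for rho in (0.0, 0.3, 0.6, 0.9, 1.0):
        print(f"fitted rho = {rho:.1f}  slope = {slope_cs(data, rho):.3f}")
```

In this sketch the true slope is `beta = 1.0` by construction. Fitting `rho = 0` reproduces pooled ordinary least squares, which absorbs the between-individual confounding, while `rho = 1` uses within-individual information only. The sketch shows only the bias side of the trade-off; the variance and mean squared error results in the abstract, and the period-effect case, are not reproduced here.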
