Model-based standardization to adjust for unmeasured cluster-level confounders with complex survey data.

Author information

Cai Zhuangyu, Brumback Babette A

Affiliation

Department of Biostatistics, University of Florida, Gainesville, FL, 32611, U.S.A.

Publication information

Stat Med. 2015 Jul 10;34(15):2368-80. doi: 10.1002/sim.6504. Epub 2015 Apr 8.

DOI:10.1002/sim.6504

PMID:25851438

Abstract

Model-based standardization uses a statistical model to estimate a standardized, or unconfounded, population-averaged effect. It lets one compare groups as they would have been had the distribution of confounders in each group been identical to that of the standard population. We develop two methods for model-based standardization with complex survey data that accommodate a categorical confounder that clusters the individual observations into a very large number of subgroups. The first method combines a random-intercept generalized linear mixed model with a conditional pseudo-likelihood estimator of the fixed effects. The second method combines a between-within generalized linear mixed model with census data on the cluster-level means of the individual-level covariates. We conduct simulation studies to compare the two approaches. We apply the two methods to the 2008 Florida Behavioral Risk Factor Surveillance System survey data to estimate standardized proportions of people who drink alcohol, within age groups, adjusting for measured individual-level and unmeasured cluster-level confounders.
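The basic idea of model-based standardization described in the abstract can be illustrated with a minimal sketch. It is not the authors' method: it uses an ordinary survey-weighted logistic regression with one measured confounder and does not handle unmeasured cluster-level confounders, random intercepts, conditional pseudo-likelihood, or census data. The function names (`fit_weighted_logistic`, `standardized_proportions`), the synthetic data, and the choice of the whole weighted sample as the standard population are all assumptions made for illustration.

```python
import numpy as np

def fit_weighted_logistic(X, y, w, n_iter=50, tol=1e-10):
    """Survey-weighted logistic regression fitted by Newton-Raphson.

    Hypothetical helper: maximizes the weighted (pseudo) log-likelihood.
    """
    beta = np.zeros(X.shape[1])
    for _ in range(n_iter):
        p = 1.0 / (1.0 + np.exp(-X @ beta))
        grad = X.T @ (w * (y - p))                      # weighted score
        hess = (X * (w * p * (1.0 - p))[:, None]).T @ X  # weighted information
        step = np.linalg.solve(hess, grad)
        beta += step
        if np.max(np.abs(step)) < tol:
            break
    return beta

def standardized_proportions(group, confounder, y, w):
    """Standardized P(Y=1) for group 0 and group 1.

    Everyone is assigned to each group in turn, and predictions are
    averaged over the weighted confounder distribution of the whole
    sample (the assumed standard population).
    """
    ones = np.ones(len(y))
    X = np.column_stack([ones, group, confounder])
    beta = fit_weighted_logistic(X, y, w)
    result = []
    for g in (0.0, 1.0):
        Xg = np.column_stack([ones, np.full(len(y), g), confounder])
        pg = 1.0 / (1.0 + np.exp(-Xg @ beta))
        result.append(np.sum(w * pg) / np.sum(w))
    return tuple(result)

# Synthetic data (assumption): the confounder drives both group
# membership and the outcome; the group itself has no true effect.
rng = np.random.default_rng(0)
n = 5000
confounder = rng.normal(size=n)
group = rng.binomial(1, 1.0 / (1.0 + np.exp(-confounder)))
y = rng.binomial(1, 1.0 / (1.0 + np.exp(-(-0.5 + 1.0 * confounder))))
w = rng.uniform(0.5, 2.0, size=n)  # made-up survey weights

crude0 = np.average(y[group == 0], weights=w[group == 0])
crude1 = np.average(y[group == 1], weights=w[group == 1])
std0, std1 = standardized_proportions(group, confounder, y, w)

print("crude difference:       ", crude1 - crude0)
print("standardized difference:", std1 - std0)
```

In this toy setting the crude comparison is confounded, while the standardized difference should be much closer to zero because both groups are evaluated against the same confounder distribution. Variance estimation for a complex survey design is deliberately left out of the sketch.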
