Department of Biostatistics, University of Florida, Gainesville, FL 32611, USA.
Stat Med. 2011 Apr 30;30(9):965-72. doi: 10.1002/sim.4185. Epub 2011 Feb 2.
Recently, we examined methods for adjusting the effect of an individual-level exposure on a binary outcome for confounding by neighborhood, using complex survey data; those methods were found to fail when neighborhood sample sizes are small and the selection bias is strongly informative. More recently, other authors have adapted an older method from the genetics literature for application to complex survey data; their adaptation yields a consistent estimator under a broad range of circumstances. The method is based on weighted pseudolikelihoods, in which the contribution from each neighborhood involves all pairs of cases and controls in that neighborhood. The pairs are treated as if they were independent, which gives a pairwise pseudo-conditional likelihood, and the corresponding score equation is then weighted by the inverse probability of sampling each case-control pair. We have greatly simplified the implementation by translating the pairwise pseudo-conditional likelihood into an equivalent ordinary weighted log-likelihood formulation. We show how to program the method using standard software for ordinary logistic regression with complex survey data (e.g. SAS PROC SURVEYLOGISTIC). We also show that the methodology applies to a broader set of sampling scenarios than those considered by the previous authors. We demonstrate the validity of our simplified implementation by applying it to a simulation for which previous methods failed; the new method performs well. We also apply the new method to an analysis of 2009 National Health Interview Survey (NHIS) public-use data, to estimate the effect of education on health insurance coverage, adjusting for confounding by neighborhood.
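The translation into an ordinary weighted logistic regression can be illustrated with a minimal Python sketch. It is not the authors' SAS code, and it rests on assumptions that the abstract does not state: each case-control pair (i, j) within a neighborhood is taken to contribute the term expit(beta'(x_i - x_j)), so the pairwise likelihood equals a no-intercept logistic regression on covariate differences in which every response is 1; and the pair weight is taken to be the product of the two individual sampling weights, which is only appropriate if individuals are sampled independently within a neighborhood. The function names `build_pairs` and `fit_pairwise_logit` and the toy data are hypothetical.

```python
import numpy as np

def build_pairs(x, y, w, nbhd):
    """Form all case-control pairs within each neighborhood.

    Returns covariate differences (case minus control) and pair weights.
    Assumption: pair weight = product of the two individual weights.
    """
    diffs, pair_w = [], []
    for g in np.unique(nbhd):
        idx = np.where(nbhd == g)[0]
        cases = idx[y[idx] == 1]
        controls = idx[y[idx] == 0]
        # Neighborhoods lacking cases or controls contribute no pairs.
        for i in cases:
            for j in controls:
                diffs.append(x[i] - x[j])
                pair_w.append(w[i] * w[j])
    return np.asarray(diffs, dtype=float), np.asarray(pair_w, dtype=float)

def fit_pairwise_logit(diffs, pair_w, n_iter=50, tol=1e-10):
    """Weighted, no-intercept logistic regression with all responses = 1,
    fitted by Newton-Raphson. Returns point estimates only."""
    beta = np.zeros(diffs.shape[1])
    for _ in range(n_iter):
        p = 1.0 / (1.0 + np.exp(-(diffs @ beta)))
        score = diffs.T @ (pair_w * (1.0 - p))
        info = (diffs * (pair_w * p * (1.0 - p))[:, None]).T @ diffs
        step = np.linalg.solve(info, score)
        beta = beta + step
        if np.max(np.abs(step)) < tol:
            break
    return beta

# Toy data: neighborhood 0 has 3 cases and 3 controls;
# neighborhood 1 has cases only and therefore yields no pairs.
x = np.array([[1.0], [1.0], [0.0], [0.0], [0.0], [1.0], [1.0], [0.0]])
y = np.array([1, 1, 1, 0, 0, 0, 1, 1])
w = np.ones(8)
nbhd = np.array([0, 0, 0, 0, 0, 0, 1, 1])

diffs, pair_w = build_pairs(x, y, w, nbhd)
beta_hat = fit_pairwise_logit(diffs, pair_w)
print(beta_hat)
```

The sketch returns point estimates only. Because pairs that share an individual are not independent, valid standard errors require design-based variance estimation of the kind provided by survey software such as SAS PROC SURVEYLOGISTIC, which this example does not attempt.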