Fears T R, Brown C C
Biometrics. 1986 Dec;42(4):955-60.
There are a number of possible designs for case-control studies. The simplest uses two separate simple random samples, but an actual study may use more complex sampling procedures. Typically, stratification is used to control for the effects of one or more risk factors in which we are interested. It has been shown (Anderson, 1972, Biometrika 59, 19-35; Prentice and Pyke, 1979, Biometrika 66, 403-411) that the unconditional logistic regression estimators apply under stratified sampling, so long as the logistic model includes a term for each stratum. We consider the case-control problem with stratified samples and assume a logistic model that does not include terms for strata, i.e., for fixed covariates the (prospective) probability of disease does not depend on stratum. We assume knowledge of the proportion sampled in each stratum as well as the total number in the stratum. We use this knowledge to obtain the maximum likelihood estimators for all parameters in the logistic model including those for variables completely associated with strata. The approach may also be applied to obtain estimators under probability sampling.