Wacholder S, Weinberg C R
Biostatistics Branch, National Cancer Institute, Rockville, Maryland 20852.
Biometrics. 1994 Jun;50(2):350-7.
Case-control studies can often be made more efficient by using frequency matching, randomized recruitment, stratified sampling, or two-stage sampling. These designs share two common features: (1) some "first-stage" variables are ascertained for all study subjects, while complete variable ascertainment is carried out for only a selected subsample, and (2) the subsampling of subjects for "second-stage" variable ascertainment depends jointly on their disease status and their observed first-stage variables. Because first-stage variables alter the subsampling fractions, standard analyses require a multiplicative specification of any joint effects of a second- and a first-stage variable. We show that by making use of missing data methods, maximum likelihood estimates can be obtained for risk parameters of interest, even those characterizing interactions between first- and second-stage variables. Joint effects can thus be modelled flexibly, with allowance for both additive and multiplicative models. Preliminary data from a case-control study of lung cancer as related to age, sex, and smoking provide an example, leading to the suggestion that the combined effect of age and smoking is multiplicative.
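The point that first-stage variables alter the subsampling fractions can be illustrated with a small numerical sketch. It is not the authors' method and uses none of their data: the counts, the sampling fractions, and all function names below are hypothetical, chosen only so that the population odds ratios are exactly multiplicative. The sketch computes expected second-stage counts when the sampling fraction depends jointly on disease status d and a binary first-stage variable z. It then shows that the odds ratio for a second-stage variable x within a level of z is unchanged by the subsampling, while the odds ratio for z is multiplied by a factor determined by the sampling fractions.

```python
from itertools import product

# Hypothetical population counts, keyed by (d, z, x):
# d = disease status, z = first-stage variable, x = second-stage variable.
population = {
    (1, 0, 0): 100, (1, 0, 1): 300,
    (1, 1, 0): 200, (1, 1, 1): 1200,
    (0, 0, 0): 4000, (0, 0, 1): 4000,
    (0, 1, 0): 4000, (0, 1, 1): 8000,
}

# Hypothetical second-stage sampling fractions, keyed by (d, z).
# They depend jointly on disease status and the first-stage variable.
fraction = {(1, 0): 1.0, (1, 1): 0.5, (0, 0): 0.1, (0, 1): 0.2}


def expected_sample(pop, frac):
    """Expected second-stage counts: population count times sampling fraction."""
    return {
        (d, z, x): pop[(d, z, x)] * frac[(d, z)]
        for d, z, x in product((0, 1), repeat=3)
    }


def odds_ratio_x(counts, z):
    """Odds ratio for the second-stage variable x within level z."""
    return (counts[(1, z, 1)] * counts[(0, z, 0)]) / (
        counts[(1, z, 0)] * counts[(0, z, 1)]
    )


def odds_ratio_z(counts, x):
    """Odds ratio for the first-stage variable z within level x."""
    return (counts[(1, 1, x)] * counts[(0, 0, x)]) / (
        counts[(1, 0, x)] * counts[(0, 1, x)]
    )


def sampling_distortion(frac):
    """Factor by which subsampling multiplies the odds ratio for z."""
    return (frac[(1, 1)] * frac[(0, 0)]) / (frac[(1, 0)] * frac[(0, 1)])


if __name__ == "__main__":
    sample = expected_sample(population, fraction)
    for z in (0, 1):
        print("OR for x at z =", z, odds_ratio_x(population, z), odds_ratio_x(sample, z))
    for x in (0, 1):
        print("OR for z at x =", x, odds_ratio_z(population, x), odds_ratio_z(sample, x))
    print("distortion factor for z:", sampling_distortion(fraction))
```

With these made-up numbers the odds ratio for x is about 3 in both the population and the subsample, whereas the odds ratio for z falls from about 2 in the population to about 0.5 in the subsample, a change by exactly the distortion factor of 0.25. Dividing the subsample odds ratio for z by that factor recovers the population value. The sketch covers only this simple multiplicative case and does not implement the maximum likelihood approach described in the abstract.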