Department of Epidemiology, Biostatistics, and Occupational Health, McGill University, Montreal, QC, H3A 1A2, Canada.
Eur J Epidemiol. 2018 Oct;33(10):897-907. doi: 10.1007/s10654-018-0434-4. Epub 2018 Aug 24.
With greater access to regression-based methods for confounder control, the etiologic study with individual matching, analyzed by classical (calculator) methods, lost favor in recent decades. This design was costly, and the data were sometimes mis-analyzed. Now, with Big Data, individual matching becomes an economical option. To many, however, conditional logistic regression, commonly used to estimate the incidence density ratio parameter, is something of a black box whose output is not easily checked. An epidemiologist-statistician pair recently proposed a new estimator that is easily applied to data from individually-matched series with a 2:1 ratio (and no other confounding variables) using just a hand calculator or spreadsheet. Surprisingly, or possibly not, they overlooked classical estimators developed in earlier decades. This prompts me to re-introduce some of these, to highlight their considerable flexibility and ease of use, and to update them. Nowadays, for any matching ratio (M:1), the Maximum Likelihood result can be easily computed from data gathered under the matched design in two different ways, each using just the summary data. One is via any binomial regression program that allows offsets, applied to just M 'rows' of data. The other is by hand! The aim of this note is not to save on computation; instead, it is to make connections between classical and regression-based methods, to promote terminology that reflects the concepts and structure of the etiologic study, and to focus attention on what parameter is being estimated.
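To make the "M rows with an offset" idea concrete, here is a minimal sketch in Python. It is not taken from the article; it assumes the standard conditional-likelihood setup for matched sets of one case and M referents with a binary exposure and no other covariates. Under that assumption, among sets with t exposed members in total (t = 1, ..., M), the probability that the case is one of the exposed members is t·ψ / (t·ψ + M + 1 − t), which is a binomial model with logit equal to log ψ plus the offset log(t / (M + 1 − t)). The function names, the dictionary layout of the summary counts, and the counts themselves are hypothetical, chosen only for illustration.

```python
import math

def mh_ratio(M, case_exposed, case_unexposed):
    """Mantel-Haenszel ratio from summary counts of 1-case : M-referent sets.

    case_exposed[t]   = number of sets with t exposed members in total
                        in which the case is exposed
    case_unexposed[t] = number of such sets in which the case is unexposed
    (t runs from 1 to M; sets with 0 or M+1 exposed members carry no information)
    """
    num = sum((M + 1 - t) * case_exposed[t] for t in range(1, M + 1))
    den = sum(t * case_unexposed[t] for t in range(1, M + 1))
    return num / den

def ml_ratio(M, case_exposed, case_unexposed, tol=1e-10):
    """Conditional maximum-likelihood ratio from the same M rows.

    Solves the binomial score equation with offset log(t / (M + 1 - t))
    by bisection on log(psi). Assumes at least one informative set with
    an exposed case and at least one with an unexposed case.
    """
    def score(log_psi):
        total = 0.0
        for t in range(1, M + 1):
            n_t = case_exposed[t] + case_unexposed[t]
            offset = math.log(t / (M + 1 - t))
            p_t = 1.0 / (1.0 + math.exp(-(log_psi + offset)))
            total += case_exposed[t] - n_t * p_t  # observed minus expected
        return total

    lo, hi = -20.0, 20.0  # the score is positive at lo and negative at hi
    while hi - lo > tol:
        mid = (lo + hi) / 2.0
        if score(mid) > 0:
            lo = mid
        else:
            hi = mid
    return math.exp((lo + hi) / 2.0)

# Hypothetical summary data for M = 2 (two referents per case)
exposed = {1: 10, 2: 8}     # case exposed, by total number exposed in the set
unexposed = {1: 10, 2: 2}   # case unexposed, by total number exposed in the set
print(mh_ratio(2, exposed, unexposed), ml_ratio(2, exposed, unexposed))
```

The hypothetical counts were constructed so that the observed and fitted frequencies coincide at ψ = 2, which is why both estimators agree here; with real data the two generally differ somewhat. The same M rows could instead be passed to any binomial regression routine that accepts an offset, fitting an intercept-only model whose intercept is log ψ.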