Fuyama Kanako, Hagiwara Yasuhiro, Matsuyama Yutaka
Graduate School of Interdisciplinary Information Studies, The University of Tokyo, Tokyo, Japan.
Department of Biostatistics, School of Public Health, The University of Tokyo, 7-3-1, Hongo, Bunkyo-ku, 113-0033, Tokyo, Japan.
Emerg Themes Epidemiol. 2021 Dec 11;18(1):18. doi: 10.1186/s12982-021-00107-2.
The risk ratio is a popular effect measure in epidemiological research. Although previous research has suggested that logistic regression may yield biased odds ratio estimates when the number of events is small and there are multiple confounders, the performance of risk ratio estimation in the presence of multiple confounders has yet to be examined.
We conducted a simulation study to evaluate the statistical performance of three regression approaches for estimating risk ratios: (1) risk ratio interpretation of logistic regression coefficients, (2) modified Poisson regression, and (3) regression standardization using logistic regression. We simulated 270 scenarios with systematically varied sample size, the number of binary confounders, exposure proportion, risk ratio, and outcome proportion. Performance evaluation was based on convergence proportion, bias, standard error estimation, and confidence interval coverage.
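The three approaches can be illustrated with a minimal sketch on one simulated dataset. Everything specific here is an assumption made for illustration and is not taken from the study: the single binary confounder, the data-generating values (baseline risk 0.10, true risk ratio 2.0, sample size 20000), the helper names `inv_link` and `fit_glm`, and the hand-written Newton-Raphson fitter with a sandwich variance. Only a point estimate is shown for regression standardization; a standard error for it would need additional steps such as the delta method or a bootstrap.

```python
import numpy as np

def inv_link(eta, link):
    """Map the linear predictor to a mean: expit for 'logit', exp for 'log'."""
    return 1.0 / (1.0 + np.exp(-eta)) if link == "logit" else np.exp(eta)

def fit_glm(X, y, link, max_iter=100, tol=1e-10):
    """Newton-Raphson fit for a binary outcome (hypothetical helper).

    link='logit' is ordinary logistic regression; link='log' is the Poisson
    working model used by modified Poisson regression.
    Returns the coefficients and a robust (sandwich) covariance matrix.
    """
    beta = np.zeros(X.shape[1])
    p = y.mean()
    beta[0] = np.log(p / (1 - p)) if link == "logit" else np.log(p)
    for _ in range(max_iter):
        mu = inv_link(X @ beta, link)
        w = mu * (1 - mu) if link == "logit" else mu  # variance function
        step = np.linalg.solve(X.T @ (X * w[:, None]), X.T @ (y - mu))
        beta = beta + step
        if np.max(np.abs(step)) < tol:
            break
    mu = inv_link(X @ beta, link)
    w = mu * (1 - mu) if link == "logit" else mu
    bread = np.linalg.inv(X.T @ (X * w[:, None]))
    meat = X.T @ (X * ((y - mu) ** 2)[:, None])
    return beta, bread @ meat @ bread

# Simulated data (all values are illustrative assumptions).
rng = np.random.default_rng(0)
n = 20000
c = rng.binomial(1, 0.5, n)                # binary confounder
a = rng.binomial(1, 0.3 + 0.3 * c)         # exposure depends on the confounder
risk = 0.10 * 2.0 ** a * np.exp(0.3 * c)   # log-linear risk, true RR = 2
y = rng.binomial(1, risk).astype(float)
X = np.column_stack([np.ones(n), a, c]).astype(float)

# (1) Exponentiated logistic coefficient read as a risk ratio (it is an odds ratio).
b_logit, _ = fit_glm(X, y, "logit")
or_as_rr = np.exp(b_logit[1])

# (2) Modified Poisson regression: log link with a robust standard error.
b_pois, cov_pois = fit_glm(X, y, "log")
rr_poisson = np.exp(b_pois[1])
se_pois = np.sqrt(cov_pois[1, 1])
ci_poisson = np.exp(b_pois[1] + np.array([-1.96, 1.96]) * se_pois)

# (3) Regression standardization: average the logistic predictions with the
# exposure set to 1 and to 0 for everyone, then take the ratio.
X1, X0 = X.copy(), X.copy()
X1[:, 1], X0[:, 1] = 1.0, 0.0
rr_std = inv_link(X1 @ b_logit, "logit").mean() / inv_link(X0 @ b_logit, "logit").mean()

print(f"logistic coefficient as RR: {or_as_rr:.2f}")
print(f"modified Poisson RR: {rr_poisson:.2f} (95% CI {ci_poisson[0]:.2f} to {ci_poisson[1]:.2f})")
print(f"standardized RR: {rr_std:.2f}")
```

Because the outcome is fairly common in this illustrative setup, the exponentiated logistic coefficient should come out noticeably above the other two estimates, which should both sit near the true value of 2.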
With a sample size of 2500 and an outcome proportion of 1%, both logistic regression and modified Poisson regression at times failed to converge, and the three approaches were comparably biased. As the outcome proportion or sample size increased, modified Poisson regression and regression standardization yielded unbiased risk ratio estimates with appropriate confidence intervals irrespective of the number of confounders. The risk ratio interpretation of logistic regression coefficients, by contrast, became substantially biased as the outcome proportion increased.
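The growing bias of the odds ratio read as a risk ratio follows from the algebraic relationship between the two measures. As a sketch, write $p_0$ for the risk in the unexposed within a confounder stratum and $\mathrm{RR}$ for the risk ratio (symbols introduced here for illustration, not taken from the paper); the numerical values are illustrative only.

```latex
\[
\mathrm{OR}
  = \frac{p_1/(1-p_1)}{p_0/(1-p_0)}
  = \mathrm{RR}\cdot\frac{1-p_0}{1-\mathrm{RR}\,p_0},
  \qquad p_1 = \mathrm{RR}\,p_0 .
\]
% Illustrative values with RR = 2:
%   p_0 = 0.01:  OR = 2 (0.99 / 0.98) \approx 2.02
%   p_0 = 0.20:  OR = 2 (0.80 / 0.60) \approx 2.67
```

For a rare outcome the factor $(1-p_0)/(1-\mathrm{RR}\,p_0)$ is close to 1, so the odds ratio approximates the risk ratio; as the outcome becomes more common, the factor moves away from 1 and the odds ratio overstates a risk ratio above 1.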
Regression approaches for estimating risk ratios should be used cautiously when the number of events is small. With an adequate number of events, modified Poisson regression and regression standardization yield valid risk ratio estimates irrespective of the number of confounders.