Julius Center for Health Sciences and Primary Care, University Medical Center Utrecht, P.O. Box 85500, 3508 GA Utrecht, The Netherlands; Division of Pharmacoepidemiology and Clinical Pharmacology, Utrecht Institute for Pharmaceutical Sciences, P.O. Box 80082, 3508 TB Utrecht, The Netherlands; Department of Farm Animal Health, Faculty of Veterinary Medicine, Utrecht University, Yalelaan 107, 3584 CL Utrecht, The Netherlands.
Julius Center for Health Sciences and Primary Care, University Medical Center Utrecht, P.O. Box 85500, 3508 GA Utrecht, The Netherlands; Division of Pharmacoepidemiology and Clinical Pharmacology, Utrecht Institute for Pharmaceutical Sciences, P.O. Box 80082, 3508 TB Utrecht, The Netherlands.
J Clin Epidemiol. 2014 Jul;67(7):821-9. doi: 10.1016/j.jclinepi.2014.02.008. Epub 2014 Apr 24.
To give a comprehensive comparison of the performance of commonly applied interaction tests.
A literature review and a simulation study were performed to evaluate interaction tests on the odds ratio (OR) or risk difference (RD) scale: the Cochran Q (Q), Breslow-Day (BD), Tarone, unconditional score, likelihood ratio (LR), Wald, and relative excess risk due to interaction (RERI)-based tests.
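As an illustration of what an interaction test on the OR scale computes, the following is a minimal sketch of a Wald test comparing the odds ratios of two strata. It assumes the saturated logistic model, in which the interaction estimate equals the difference of the two stratum log odds ratios and its variance is the sum of the reciprocal cell counts. The function name, the table layout, and the example counts are hypothetical and are not taken from the study; the exact implementations used in the simulation are not described here.

```python
import math


def wald_or_interaction(table0, table1):
    """Wald test for a difference in odds ratios between two strata.

    Each table is (a, b, c, d): exposed cases, exposed non-cases,
    unexposed cases, unexposed non-cases. All cells must be > 0.
    This layout is an assumption made for the sketch.
    """

    def log_or_and_variance(table):
        a, b, c, d = table
        log_or = math.log((a * d) / (b * c))
        # Large-sample (Woolf) variance of the log odds ratio.
        variance = 1 / a + 1 / b + 1 / c + 1 / d
        return log_or, variance

    log_or0, var0 = log_or_and_variance(table0)
    log_or1, var1 = log_or_and_variance(table1)

    # Interaction on the OR scale: difference of the log odds ratios.
    z = (log_or1 - log_or0) / math.sqrt(var0 + var1)
    # Two-sided p-value from the standard normal distribution.
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value


if __name__ == "__main__":
    # Hypothetical counts: OR = 1 in stratum 0, OR = 4 in stratum 1.
    z, p = wald_or_interaction((10, 10, 10, 10), (20, 10, 10, 20))
    print(f"z = {z:.3f}, p = {p:.3f}")
```

With small cell counts the normal approximation behind this statistic can be poor, which is the setting in which the compared tests are reported to diverge.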
Review results agreed with results from our simulation study, which showed that on the OR scale, in small samples (eg, number of subjects ≤ 250), the type 1 error rate of the LR test was 0.10, whereas the BD and Tarone tests had rates around 0.05. On the RD scale, the LR and RERI tests had error rates around 0.05. On both scales, tests did not differ regarding power. When exposure prevented the outcome, RERI-based tests were relatively underpowered (eg, N = 100; RERI power = 5% vs. Wald power = 18%). With increasing sample size, this difference decreased.
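For readers unfamiliar with the RERI, the sketch below computes it from the four risks defined by two binary exposures, using the standard definition RERI = RR11 - RR10 - RR01 + 1. Because this equals (r11 - r10 - r01 + r00) / r00, a RERI of zero corresponds to no interaction on the RD scale. The function name and the example risks are hypothetical and only illustrate the definition, including a case where both exposures are preventive; they are not values from the simulation.

```python
def reri(r00, r01, r10, r11):
    """Relative excess risk due to interaction.

    r00: risk with neither exposure (reference, must be > 0)
    r01: risk with only the second exposure
    r10: risk with only the first exposure
    r11: risk with both exposures
    """
    rr01 = r01 / r00
    rr10 = r10 / r00
    rr11 = r11 / r00
    return rr11 - rr10 - rr01 + 1


if __name__ == "__main__":
    # Risks that are exactly additive: RERI is 0 up to rounding error.
    print(reri(0.1, 0.3, 0.2, 0.4))
    # Hypothetical preventive exposures: every risk is below the reference.
    print(reri(0.4, 0.2, 0.2, 0.1))
```

When exposures are preventive, all relative risks lie between 0 and 1, so the RERI is confined to a narrow range.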
In small samples, the performance of interaction tests differed. On the OR scale, the Tarone and BD tests are recommended. On the RD scale, the LR and RERI-based tests performed best. However, RERI-based tests are underpowered compared with other tests when exposure prevents the outcome and sample size is limited.