Exploring interaction effects in small samples increases rates of false-positive and false-negative findings: results from a systematic review and simulation study.

Author information

Julius Center for Health Sciences and Primary Care, University Medical Center Utrecht, P.O. Box 85500, 3508 GA Utrecht, The Netherlands; Division of Pharmacoepidemiology and Clinical Pharmacology, Utrecht Institute for Pharmaceutical Sciences, P.O. Box 80082, 3508 TB Utrecht, The Netherlands; Department of Farm Animal Health, Faculty of Veterinary Medicine, Utrecht University, Yalelaan 107, 3584 CL Utrecht, The Netherlands.

Publication information

J Clin Epidemiol. 2014 Jul;67(7):821-9. doi: 10.1016/j.jclinepi.2014.02.008. Epub 2014 Apr 24.

DOI:10.1016/j.jclinepi.2014.02.008

PMID:24768005

Abstract

OBJECTIVE

To give a comprehensive comparison of the performance of commonly applied interaction tests.

METHODS

A literature review and a simulation study were performed to evaluate interaction tests on the odds ratio (OR) or the risk difference (RD) scale: Cochran Q (Q), Breslow-Day (BD), Tarone, unconditional score, likelihood ratio (LR), Wald, and relative excess risk due to interaction (RERI)-based tests.
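For orientation, the quantities these tests target can be written out under standard textbook definitions. The notation is an assumption introduced here, not taken from the abstract: p_{xz} denotes the outcome risk for exposure x and second factor z (each 0 or 1), RR_{xz} = p_{xz}/p_{00}, and OR_{xz} is the corresponding odds ratio relative to the doubly unexposed group.

```latex
% No interaction on the OR (multiplicative) scale:
\frac{OR_{11}}{OR_{10}\,OR_{01}} = 1
% No interaction on the RD (additive) scale:
p_{11} - p_{10} - p_{01} + p_{00} = 0
% Relative excess risk due to interaction; dividing the RD contrast by p_{00}:
\mathrm{RERI} = RR_{11} - RR_{10} - RR_{01} + 1
```

Under these definitions RERI = 0 corresponds to no additive interaction; in practice the RRs are often approximated by ORs from a logistic model.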

RESULTS

Review results agreed with results from our simulation study, which showed that on the OR scale, in small samples (eg, number of subjects ≤ 250), the type 1 error rate of the LR test was 0.10, whereas the BD and Tarone tests showed rates around 0.05. On the RD scale, the LR and RERI tests had type 1 error rates around 0.05. On both scales, tests did not differ regarding power. When exposure prevented the outcome, RERI-based tests were relatively underpowered (eg, N = 100; RERI power = 5% vs. Wald power = 18%). With increasing sample size, this difference decreased.
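How a type 1 error rate of this kind can be estimated is illustrated by the minimal Python sketch below. It is not the authors' simulation design: the data-generating model (two binary factors assigned with probability 0.5 each, baseline risk 0.2, main-effect ORs of 2), the 0.5 added to every cell to avoid zero counts, and all function names are assumptions made for illustration. It applies a Wald-type test of equal log ORs across two strata and counts how often the test rejects at the 5% level when no OR-scale interaction is present.

```python
import numpy as np

CHI2_1DF_95 = 3.841458820694124  # 95th percentile of chi-square, 1 df


def draw_counts(rng, n, base_risk=0.2, or_x=2.0, or_z=2.0, or_int=1.0):
    """Simulate n subjects; return 2x2x2 counts indexed [z, x, y].

    or_int = 1 means no interaction on the OR scale (the null hypothesis).
    All default parameter values are illustrative assumptions.
    """
    z = rng.integers(0, 2, size=n)  # second factor (effect modifier)
    x = rng.integers(0, 2, size=n)  # exposure
    odds = base_risk / (1 - base_risk) * or_x**x * or_z**z * or_int**(x * z)
    y = (rng.random(n) < odds / (1 + odds)).astype(int)  # binary outcome
    counts = np.zeros((2, 2, 2))
    np.add.at(counts, (z, x, y), 1)
    return counts


def wald_interaction_stat(counts):
    """Wald statistic for equal log ORs in the two strata of z.

    Adds 0.5 to every cell (an assumption) so sparse tables stay finite.
    """
    c = counts + 0.5
    log_or = np.log(c[:, 1, 1] * c[:, 0, 0] / (c[:, 1, 0] * c[:, 0, 1]))
    var = (1.0 / c).sum(axis=(1, 2))  # Woolf variance of each log OR
    return (log_or[1] - log_or[0]) ** 2 / var.sum()


def rejection_rate(n, n_sim=2000, seed=0, **model):
    """Share of simulated datasets in which the test rejects at alpha = 0.05."""
    rng = np.random.default_rng(seed)
    stats = [wald_interaction_stat(draw_counts(rng, n, **model))
             for _ in range(n_sim)]
    return float(np.mean(np.array(stats) > CHI2_1DF_95))


if __name__ == "__main__":
    for n in (100, 250, 1000):
        print(n, rejection_rate(n))  # estimated type 1 error under the null
```

Calling `rejection_rate(n, or_int=2.0)` (or any value other than 1) turns the same loop into a power estimate; the other tests named in the abstract would need their own statistic functions.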

CONCLUSION

In small samples, the performance of interaction tests differed. On the OR scale, the Tarone and BD tests are recommended. On the RD scale, the LR and RERI-based tests performed best. However, RERI-based tests are underpowered compared with other tests when exposure prevents the outcome and sample size is limited.

Similar articles

Comparison of three tests of homogeneity of odds ratios in multicenter trials with unequal sample sizes within and among centers.

BMC Med Res Methodol. 2011 Apr 26;11:58. doi: 10.1186/1471-2288-11-58.

A simulation study for comparing testing statistics in response-adaptive randomization.

BMC Med Res Methodol. 2010 Jun 5;10:48. doi: 10.1186/1471-2288-10-48.

Estimation of the relative excess risk due to interaction and associated confidence bounds.

Am J Epidemiol. 2009 Mar 15;169(6):756-60. doi: 10.1093/aje/kwn411. Epub 2009 Feb 11.

Folic acid supplementation and malaria susceptibility and severity among people taking antifolate antimalarial drugs in endemic areas.

Cochrane Database Syst Rev. 2022 Feb 1;2(2022):CD014217. doi: 10.1002/14651858.CD014217.

Subgroup analyses in randomised controlled trials: quantifying the risks of false-positives and false-negatives.

Health Technol Assess. 2001;5(33):1-56. doi: 10.3310/hta5330.

Performance of model-based vs. permutation tests in the HEALing (Helping to End Addiction Long-term) Communities Study, a covariate-constrained cluster randomized trial.

Trials. 2022 Sep 8;23(1):762. doi: 10.1186/s13063-022-06708-9.

Diagnostic test accuracy of nutritional tools used to identify undernutrition in patients with colorectal cancer: a systematic review.

JBI Database System Rev Implement Rep. 2015 May 15;13(4):141-87. doi: 10.11124/jbisrir-2015-1673.

Exact and asymptotic tests for homogeneity in several 2 x 2 tables.

Stat Med. 1999 Apr 30;18(8):893-906. doi: 10.1002/(sici)1097-0258(19990430)18:8<893::aid-sim84>3.0.co;2-5.

Testing the non-unity of rate ratio under inverse sampling.

Biom J. 2007 Aug;49(4):551-64. doi: 10.1002/bimj.200610337.

Cited by

Myosin inhibitors for treatment of hypertrophic cardiomyopathy.

Cochrane Database Syst Rev. 2025 Jun 23;6(6):CD016183. doi: 10.1002/14651858.CD016183.

Exploring the Effectiveness of Road Maintenance Interventions on IRI Value Using Crowdsourced Connected Vehicle Data.

Sensors (Basel). 2025 May 14;25(10):3091. doi: 10.3390/s25103091.

Longitudinal associations between BMI, ideal-actual BMI gap, and body shape concern among young Chinese females.

Front Public Health. 2025 May 8;13:1549695. doi: 10.3389/fpubh.2025.1549695. eCollection 2025.

Age-specific determinants of reduced exercise capacity in youth after heart transplant: A longitudinal cohort study.

JHLT Open. 2024 Feb 19;4:100075. doi: 10.1016/j.jhlto.2024.100075. eCollection 2024 May.

Bifidobacterium longum and microbiome maturation modify a nutrient intervention for stunting in Zimbabwean infants.

EBioMedicine. 2024 Oct;108:105362. doi: 10.1016/j.ebiom.2024.105362. Epub 2024 Sep 27.

CXCL10/IgG1 Axis in Multiple Sclerosis as a Potential Predictive Biomarker of Disease Activity.

Neurol Neuroimmunol Neuroinflamm. 2024 Mar;11(2):e200200. doi: 10.1212/NXI.0000000000200200. Epub 2024 Feb 12.

Positive parenting moderates associations between childhood stress and corticolimbic structure.

PNAS Nexus. 2023 Jun 13;2(6):pgad145. doi: 10.1093/pnasnexus/pgad145. eCollection 2023 Jun.

Cholesteryl ester transfer protein (CETP) as a drug target for cardiovascular disease.

Nat Commun. 2021 Sep 24;12(1):5640. doi: 10.1038/s41467-021-25703-3.

Coexposure to Inhaled Aldehydes or Carbon Dioxide Enhances the Carcinogenic Properties of the Tobacco-Specific Nitrosamine 4-Methylnitrosamino-1-(3-pyridyl)-1-butanone in the A/J Mouse Lung.

Chem Res Toxicol. 2021 Mar 15;34(3):723-732. doi: 10.1021/acs.chemrestox.0c00350. Epub 2021 Feb 25.

Effectiveness by gender and age of renin-angiotensin system blockade in heart failure-A national register-based cohort study.

Pharmacoepidemiol Drug Saf. 2020 May;29(5):518-529. doi: 10.1002/pds.4958. Epub 2020 Feb 17.
