Interaction analysis under misspecification of main effects: Some common mistakes and simple solutions.

Authors

Zhang Min, Yu Youfei, Wang Shikun, Salvatore Maxwell, Fritsche Lars G, He Zihuai, Mukherjee Bhramar

Affiliations

Department of Biostatistics, University of Michigan School of Public Health, Ann Arbor, Michigan.

Department of Biostatistics, MD Anderson Cancer Center, Houston, Texas.

Publication information

Stat Med. 2020 May 20;39(11):1675-1694. doi: 10.1002/sim.8505. Epub 2020 Feb 26.

DOI: 10.1002/sim.8505
PMID: 32101638

Abstract

The statistical practice of modeling interaction with two linear main effects and a product term is ubiquitous in the statistical and epidemiological literature. Most data modelers are aware that the misspecification of main effects can potentially cause severe type I error inflation in tests for interactions, leading to spurious detection of interactions. However, modeling practice has not changed. In this article, we focus on the specific situation where the main effects in the model are misspecified as linear terms and characterize its impact on common tests for statistical interaction. We then propose some simple alternatives that fix the issue of potential type I error inflation in testing interaction due to main effect misspecification. We show that when using the sandwich variance estimator for a linear regression model with a quantitative outcome and two independent factors, both the Wald and score tests asymptotically maintain the correct type I error rate. However, if the independence assumption does not hold or the outcome is binary, using the sandwich estimator does not fix the problem. We further demonstrate that flexibly modeling the main effect under a generalized additive model can largely reduce or often remove bias in the estimates and maintain the correct type I error rate for both quantitative and binary outcomes regardless of the independence assumption. We show, under the independence assumption and for a continuous outcome, overfitting and flexibly modeling the main effects does not lead to power loss asymptotically relative to a correctly specified main effect model. Our simulation study further demonstrates the empirical fact that using flexible models for the main effects does not result in a significant loss of power for testing interaction in general. Our results provide an improved understanding of the strengths and limitations for tests of interaction in the presence of main effect misspecification. 
Using data from a large biobank study "The Michigan Genomics Initiative", we present two examples of interaction analysis in support of our results.
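
The type I error behavior described in the abstract can be illustrated with a small Monte Carlo sketch. This is not the authors' simulation design: the data-generating model (quadratic main effects, no interaction), the sample size, the correlation of 0.5, and the function names `interaction_z` and `rejection_rate` are all assumptions chosen for illustration. A cubic polynomial basis stands in for the generalized additive model smoothers, and the sandwich estimator is the basic HC0 form.

```python
import numpy as np

def interaction_z(y, x1, x2, flexible=False, robust=False):
    """Wald z-statistic for the x1*x2 coefficient in an OLS fit."""
    cols = [np.ones_like(x1), x1, x2]
    if flexible:
        # Cubic polynomial main effects: a simple stand-in for GAM smoothers.
        cols += [x1**2, x1**3, x2**2, x2**3]
    cols.append(x1 * x2)  # the product term is always the last column
    X = np.column_stack(cols)
    n, p = X.shape
    XtX_inv = np.linalg.inv(X.T @ X)
    beta = XtX_inv @ X.T @ y
    resid = y - X @ beta
    if robust:
        # HC0 sandwich: (X'X)^-1 X' diag(resid^2) X (X'X)^-1
        meat = X.T @ (X * resid[:, None] ** 2)
        cov = XtX_inv @ meat @ XtX_inv
    else:
        # Model-based variance, valid only if the mean model is correct.
        cov = XtX_inv * (resid @ resid) / (n - p)
    return beta[-1] / np.sqrt(cov[-1, -1])

def rejection_rate(rho, flexible, robust, n=500, reps=400, seed=0):
    """Share of simulated null datasets where |z| > 1.96 (nominal 5% test)."""
    rng = np.random.default_rng(seed)
    rejections = 0
    for _ in range(reps):
        x1 = rng.standard_normal(n)
        x2 = rho * x1 + np.sqrt(1 - rho**2) * rng.standard_normal(n)
        # True model: nonlinear main effects and NO interaction.
        y = x1**2 + x2**2 + rng.standard_normal(n)
        z = interaction_z(y, x1, x2, flexible, robust)
        rejections += abs(z) > 1.96
    return rejections / reps

results = {
    "linear, model-based, independent": rejection_rate(0.0, False, False),
    "linear, sandwich, independent": rejection_rate(0.0, False, True),
    "linear, sandwich, correlated": rejection_rate(0.5, False, True),
    "flexible, model-based, correlated": rejection_rate(0.5, True, False),
}
for label, rate in results.items():
    print(f"{label}: {rate:.3f}")
```

Under these assumptions one would expect the linear model with model-based variance to reject far more often than 5% even when the factors are independent, the sandwich variance to bring the rate close to nominal only in the independent case, and the flexible main-effect model to stay near nominal when the factors are correlated. The binary-outcome case discussed in the abstract is not covered by this sketch.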

Similar articles

1. Interaction analysis under misspecification of main effects: Some common mistakes and simple solutions.
   Stat Med. 2020 May 20;39(11):1675-1694. doi: 10.1002/sim.8505. Epub 2020 Feb 26.
2. Testing for gene-environment interaction under exposure misspecification.
   Biometrics. 2018 Jun;74(2):653-662. doi: 10.1111/biom.12813. Epub 2017 Nov 9.
3. Robust Tests for Additive Gene-Environment Interaction in Case-Control Studies Using Gene-Environment Independence.
   Am J Epidemiol. 2018 Feb 1;187(2):366-377. doi: 10.1093/aje/kwx243.
4. Type I and Type II error under random-effects misspecification in generalized linear mixed models.
   Biometrics. 2007 Dec;63(4):1038-44. doi: 10.1111/j.1541-0420.2007.00782.x. Epub 2007 Apr 9.
5. The impact of covariance misspecification in multivariate Gaussian mixtures on estimation and inference: an application to longitudinal modeling.
   Stat Med. 2013 Jul 20;32(16):2790-803. doi: 10.1002/sim.5729. Epub 2013 Jan 7.
6. Subgroup analyses in randomised controlled trials: quantifying the risks of false-positives and false-negatives.
   Health Technol Assess. 2001;5(33):1-56. doi: 10.3310/hta5330.
7. Choice of parametric accelerated life and proportional hazards models for survival data: asymptotic results.
   Lifetime Data Anal. 2002 Dec;8(4):375-93. doi: 10.1023/a:1020570922072.
8. Misspecification of confounder-exposure and confounder-outcome associations leads to bias in effect estimates.
   BMC Med Res Methodol. 2023 Jan 12;23(1):11. doi: 10.1186/s12874-022-01817-0.
9. Improved hypothesis testing for coefficients in generalized estimating equations with small samples of clusters.
   Stat Med. 2006 Dec 15;25(23):4081-98. doi: 10.1002/sim.2502.
10. Small sample performance of bias-corrected sandwich estimators for cluster-randomized trials with binary outcomes.
    Stat Med. 2015 Jan 30;34(2):281-96. doi: 10.1002/sim.6344. Epub 2014 Oct 24.

Cited by

1. Adjuvant nivolumab in muscle-invasive urothelial carcinoma: exploratory biomarker analysis of the randomized phase 3 CheckMate 274 trial.
   Nat Med. 2025 Aug 7. doi: 10.1038/s41591-025-03802-8.
2. Identification of Pancreatic Cancer Germline Risk Variants With Effects That Are Modified by Smoking.
   JCO Precis Oncol. 2024 Mar;8:e2300355. doi: 10.1200/PO.23.00355.
3. A robust and adaptive framework for interaction testing in quantitative traits between multiple genetic loci and exposure variables.
   PLoS Genet. 2022 Nov 16;18(11):e1010464. doi: 10.1371/journal.pgen.1010464. eCollection 2022 Nov.
4. A hierarchical integrative group least absolute shrinkage and selection operator for analyzing environmental mixtures.
   Environmetrics. 2021 Dec;32(8). doi: 10.1002/env.2698. Epub 2021 Jul 30.
5. GEM: scalable and flexible gene-environment interaction analysis in millions of samples.
   Bioinformatics. 2021 Oct 25;37(20):3514-3520. doi: 10.1093/bioinformatics/btab223.