Brion Marie-Jo A, Shakhbazov Konstantin, Visscher Peter M
Broad Institute of MIT & Harvard, Cambridge, MA USA, MRC Centre for Causal Analyses in Translational Epidemiology, School of Social and Community Medicine, University of Bristol, Bristol, UK, Queensland Brain Institute, QLD, Australia and University of Queensland Diamantina Institute, University of Queensland, Brisbane, QLD, Australia.
Int J Epidemiol. 2013 Oct;42(5):1497-501. doi: 10.1093/ije/dyt179.
In Mendelian randomization (MR) studies, where genetic variants are used as proxy measures for an exposure trait of interest, obtaining adequate statistical power is frequently a concern because genetic variants typically explain only a small proportion of the variation in a phenotypic trait. A range of power estimates based on simulations and specific parameters for two-stage least squares (2SLS) MR analyses of continuous variables has previously been published. However, there are presently no specific equations or software tools that one can use to calculate the power of a given MR study. Using asymptotic theory, we show that in the case of continuous variables and a single instrument, for example a single-nucleotide polymorphism (SNP) or a multiple-SNP predictor, statistical power for a fixed sample size is a function of two parameters: the proportion of variation in the exposure variable explained by the genetic predictor and the true causal association between the exposure and outcome variables. We demonstrate that power for 2SLS MR can be derived using the non-centrality parameter (NCP) of the statistical test that is employed to test whether the 2SLS regression coefficient is zero. We show that the previously published simulation-based power estimates can be reproduced theoretically using this NCP-based approach, with the two sets of estimates closely agreeing. General equations for calculating statistical power for 2SLS MR using the NCP are provided in this note, and we implement the calculations in a web-based application.
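The NCP-based power calculation described in the abstract can be sketched as follows. This is a minimal illustration, not the authors' published equations or web application: it assumes the standard asymptotic variance of the 2SLS estimator with a single instrument, var(b) = var_resid / (n * r2_xz * var_x), so that NCP = n * r2_xz * beta_yx^2 * var_x / var_resid. The function name `mr_2sls_power` and all parameter names are hypothetical, and `var_resid` (the residual variance of the outcome in the causal model) is an input the user must supply. For a test with one degree of freedom, the non-central chi-square tail probability equals a two-sided normal tail probability, which is what the code uses.

```python
from math import sqrt
from statistics import NormalDist


def mr_2sls_power(n, r2_xz, beta_yx, var_x=1.0, var_resid=1.0, alpha=0.05):
    """Approximate power of a single-instrument 2SLS MR test of beta_yx = 0.

    Assumed (hypothetical) inputs:
      n         sample size
      r2_xz     proportion of exposure variance explained by the instrument
      beta_yx   true causal effect of exposure on outcome
      var_x     variance of the exposure
      var_resid residual variance of the outcome in the causal model
      alpha     two-sided type I error rate
    Returns (ncp, power).
    """
    if n <= 0 or not 0.0 < r2_xz <= 1.0:
        raise ValueError("n must be positive and r2_xz must lie in (0, 1]")
    if var_x <= 0 or var_resid <= 0 or not 0.0 < alpha < 1.0:
        raise ValueError("variances must be positive and alpha in (0, 1)")

    # Non-centrality parameter: squared effect divided by the assumed
    # asymptotic variance of the 2SLS estimate.
    ncp = n * r2_xz * beta_yx ** 2 * var_x / var_resid

    # With 1 degree of freedom, P(chi2_1(ncp) > critical value) equals the
    # two-sided tail probability of a normal variate shifted by sqrt(ncp).
    std_normal = NormalDist()
    z_crit = std_normal.inv_cdf(1.0 - alpha / 2.0)
    shift = sqrt(ncp)
    power = std_normal.cdf(shift - z_crit) + std_normal.cdf(-shift - z_crit)
    return ncp, power


if __name__ == "__main__":
    # Illustrative values only; they are not taken from the paper.
    ncp, power = mr_2sls_power(n=10_000, r2_xz=0.01, beta_yx=0.2)
    print(f"NCP = {ncp:.3f}, power = {power:.3f}")
```

Under these assumptions, power depends on the sample size only through the NCP, so halving `r2_xz` requires roughly doubling `n` to keep power unchanged.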