Department of Biostatistics, University of Iowa, 145 N Riverside Dr., Iowa City, IA 52242, USA.
Genes (Basel). 2023 Jan 13;14(1):211. doi: 10.3390/genes14010211.
The summary-data-based Mendelian randomization (SMR) method is gaining popularity in estimating the causal effect of an exposure on an outcome. In practice, the instrument SNP is often selected from the genome-wide association study (GWAS) on the exposure but no correction is made for such selection in downstream analysis, leading to a biased estimate of the effect size and invalid inference. We address this issue by using the likelihood derived from the sampling distribution of the estimated SNP effects in the exposure GWAS and the outcome GWAS. This likelihood takes into account how the instrument SNPs are selected. Since the effective sample size is 1, the asymptotic theory does not apply. We use a support for a profile likelihood as an interval estimate of the causal effect. Simulation studies indicate that this support has robust coverage while the confidence interval implied by the SMR method has lower-than-nominal coverage. Furthermore, the variance of the two-stage least squares estimate of the causal effect is shown to be the same as the variance used for SMR for one-sample data when there is no selection.
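The selection problem described in the abstract can be illustrated with a small simulation. The sketch below is not the authors' likelihood-based method. It only applies the standard ratio estimate with a delta-method standard error, and it assumes two independent samples (exposure GWAS and outcome GWAS), normally distributed SNP effect estimates, and a genome-wide significance cutoff of |z| > 5.45 on the exposure GWAS. The function names and all parameter values are hypothetical and chosen only for illustration.

```python
import numpy as np


def smr_estimate(b_zx, se_zx, b_zy, se_zy):
    """Ratio estimate b_zy / b_zx with a delta-method standard error.

    Assumes the exposure and outcome estimates come from independent
    samples, so no covariance term is included.
    """
    b_xy = b_zy / b_zx
    se_xy = np.sqrt(se_zy**2 / b_zx**2 + b_zy**2 * se_zx**2 / b_zx**4)
    return b_xy, se_xy


def simulate_coverage(beta=0.5, b_zx=0.05, se_zx=0.01, se_zy=0.002,
                      z_threshold=5.45, n_rep=200_000, seed=0):
    """Compare 95% interval coverage with and without instrument selection.

    All default values are illustrative assumptions, not taken from the paper.
    """
    rng = np.random.default_rng(seed)
    # Sampling distributions of the estimated SNP effects in the two GWAS.
    b_zx_hat = rng.normal(b_zx, se_zx, n_rep)
    b_zy_hat = rng.normal(beta * b_zx, se_zy, n_rep)

    b_xy, se_xy = smr_estimate(b_zx_hat, se_zx, b_zy_hat, se_zy)
    covered = np.abs(b_xy - beta) <= 1.96 * se_xy

    # The SNP is used as an instrument only if it passes the exposure GWAS cutoff.
    selected = np.abs(b_zx_hat / se_zx) > z_threshold

    return {
        "coverage_all": covered.mean(),
        "coverage_selected": covered[selected].mean(),
        "mean_estimate_selected": b_xy[selected].mean(),
        "fraction_selected": selected.mean(),
    }


if __name__ == "__main__":
    for name, value in simulate_coverage().items():
        print(f"{name}: {value:.3f}")
```

Under these assumed settings, the replicates that pass the cutoff have an inflated exposure effect estimate, so the ratio is pulled toward zero and the naive interval covers the true effect less often than it does without selection. This is the behavior the abstract attributes to uncorrected instrument selection.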