Department of Biostatistics, University of Michigan, Ann Arbor, Michigan, USA.
Department of Medicine, Divisions of Cardiology and Neurology, Duke University Medical Center, Durham, North Carolina, USA.
Genet Epidemiol. 2023 Mar;47(2):167-184. doi: 10.1002/gepi.22510. Epub 2022 Dec 8.
Mediation hypothesis testing for a large number of mediators is challenging due to the composite structure of the null hypothesis, $H_0: \alpha\beta = 0$ ($\alpha$: effect of the exposure on the mediator after adjusting for confounders; $\beta$: effect of the mediator on the outcome after adjusting for exposure and confounders). In this paper, we reviewed three classes of methods for large-scale one-at-a-time mediation hypothesis testing. These methods are commonly used for continuous outcomes and continuous mediators, assuming there is no exposure-mediator interaction so that the product $\alpha\beta$ has a causal interpretation as the indirect effect. The first class of methods ignores the impact of the different structures under the composite null hypothesis, namely, (1) $\alpha = 0, \beta \neq 0$; (2) $\alpha \neq 0, \beta = 0$; and (3) $\alpha = \beta = 0$. The second class of methods weights the reference distribution under each case of the null to form a mixture reference distribution. The third class constructs a composite test statistic using the three p values obtained under each case of the null so that the reference distribution of the composite statistic is approximately $U(0, 1)$. In addition to these existing methods, we developed the Sobel-comp method, which belongs to the second class and uses a corrected mixture reference distribution for Sobel's test statistic. We performed extensive simulation studies to compare all six methods belonging to these three classes in terms of the false positive rates (FPRs) under the null hypothesis and the true positive rates under the alternative hypothesis. We found that the second class of methods, which uses a mixture reference distribution, best maintained the FPRs at the nominal level under the null hypothesis and had the greatest true positive rates under the alternative hypothesis. We applied all methods to study the mediation mechanism of DNA methylation sites in the pathway from adult socioeconomic status to glycated hemoglobin level using data from the Multi-Ethnic Study of Atherosclerosis (MESA). We provide guidelines for choosing the optimal mediation hypothesis testing method in practice and developed an R package, medScan, available on CRAN, that implements all six methods.
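To illustrate why the structure of the composite null matters, the following minimal Python sketch computes the standard first-order Sobel statistic and compares it with a standard normal reference distribution. It then simulates the case where both effects are zero. The function names (`sobel_test`, `null_rejection_rate`) and the simplification of unit standard errors are assumptions made for illustration only; this is not the medScan API and not the Sobel-comp correction described in the abstract.

```python
import math
import random


def normal_two_sided_p(z):
    """Two-sided p value for a statistic with a N(0, 1) reference."""
    return math.erfc(abs(z) / math.sqrt(2.0))


def sobel_test(alpha_hat, se_alpha, beta_hat, se_beta):
    """First-order Sobel statistic for the product alpha * beta.

    Returns (z, p), where p uses a standard normal reference
    distribution, i.e. it ignores which case of the composite null holds.
    """
    se_prod = math.sqrt(alpha_hat ** 2 * se_beta ** 2
                        + beta_hat ** 2 * se_alpha ** 2)
    if se_prod == 0.0:
        # Both estimates are exactly zero: no evidence of mediation.
        return 0.0, 1.0
    z = alpha_hat * beta_hat / se_prod
    return z, normal_two_sided_p(z)


def null_rejection_rate(n_sim, level=0.05, seed=0):
    """Empirical rejection rate of the Sobel test when alpha = beta = 0.

    Illustrative assumption: both estimates are independent N(0, 1)
    draws with known unit standard errors.
    """
    rng = random.Random(seed)
    rejections = 0
    for _ in range(n_sim):
        alpha_hat = rng.gauss(0.0, 1.0)
        beta_hat = rng.gauss(0.0, 1.0)
        _, p = sobel_test(alpha_hat, 1.0, beta_hat, 1.0)
        if p < level:
            rejections += 1
    return rejections / n_sim


if __name__ == "__main__":
    z, p = sobel_test(0.5, 0.1, 0.4, 0.1)
    print(f"Sobel z = {z:.3f}, p = {p:.4f}")
    rate = null_rejection_rate(20000, level=0.05, seed=1)
    print(f"Rejection rate under alpha = beta = 0: {rate:.4f}")
```

Under these assumptions, the simulated rejection rate falls far below the nominal 0.05 level, because the Sobel statistic is much less variable than a standard normal when both effects are zero. This conservativeness is the kind of behavior that a mixture reference distribution is meant to address.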