Biostatistics Branch, Division of Cancer Epidemiology and Genetics, National Cancer Institute, Rockville, Maryland, USA.
Metabolomics Epidemiology Branch, Division of Cancer Epidemiology and Genetics, National Cancer Institute, Rockville, Maryland, USA.
Stat Med. 2020 Aug 15;39(18):2423-2436. doi: 10.1002/sim.8546. Epub 2020 May 4.
We consider the scenario in which there is an exposure, multiple biologically defined sets of biomarkers, and an outcome. We propose a new two-step procedure that tests whether any of the sets of biomarkers mediate the exposure/outcome relationship while maintaining a prespecified familywise error rate. The first step is a screening step that removes all sets that are unlikely to be strongly associated with both the exposure and the outcome. The second step adapts recent advances in postselection inference to test whether there are true mediators in each of the remaining candidate sets. We use simulation to show that this simple two-step procedure has higher statistical power to detect true mediating sets than existing procedures. We then use the two-step procedure to identify a set of lysine-related metabolites that potentially mediate the known relationship between increased body mass index and increased risk of estrogen receptor-positive breast cancer in postmenopausal women.
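The screen-then-test structure described above can be illustrated with a minimal Python sketch. It is not the authors' implementation: in place of the postselection inference used in the paper, it substitutes plain sample splitting (screen on one half of the data, test on the other half) with a Bonferroni correction over the retained sets, which is a simpler way to keep the familywise error rate valid after screening. The group summary score (mean of standardized biomarkers), the joint-significance (max-P) test, the normal approximation for p-values, and all names and parameters (`group_paths`, `two_step`, `n_keep`, the simulated data) are assumptions made for illustration.

```python
import math
import numpy as np

def _pvalue_slope(design, response, col):
    """Two-sided normal-approximation p-value for one OLS coefficient."""
    n, k = design.shape
    coef, *_ = np.linalg.lstsq(design, response, rcond=None)
    resid = response - design @ coef
    sigma2 = resid @ resid / (n - k)
    cov = sigma2 * np.linalg.inv(design.T @ design)
    z = coef[col] / math.sqrt(cov[col, col])
    return math.erfc(abs(z) / math.sqrt(2))

def group_paths(x, y, m_group):
    """P-values for exposure -> score and score -> outcome given exposure.

    The score (mean of standardized biomarkers) is an illustrative
    summary of the set, not the statistic used in the paper.
    """
    score = ((m_group - m_group.mean(0)) / m_group.std(0)).mean(1)
    ones = np.ones_like(x)
    p_a = _pvalue_slope(np.column_stack([ones, x]), score, 1)
    p_b = _pvalue_slope(np.column_stack([ones, x, score]), y, 2)
    return p_a, p_b

def two_step(x, y, groups, n_keep=2, alpha=0.05, seed=0):
    """Screen sets on one half of the data, test survivors on the other."""
    rng = np.random.default_rng(seed)
    idx = rng.permutation(len(x))
    half = len(x) // 2
    scr, tst = idx[:half], idx[half:]

    # Step 1 (screening): keep the n_keep sets whose weaker path
    # (larger of the two p-values) looks strongest on the screening half.
    screen_p = {g: max(group_paths(x[scr], y[scr], m[scr]))
                for g, m in groups.items()}
    kept = sorted(screen_p, key=screen_p.get)[:n_keep]

    # Step 2 (testing): max-P test on the held-out half, with a
    # Bonferroni correction over the retained sets only.
    results = {}
    for g in kept:
        p = max(group_paths(x[tst], y[tst], groups[g][tst]))
        results[g] = (p, p <= alpha / len(kept))
    return results

# Simulated example (hypothetical data): only set "g0" mediates.
rng = np.random.default_rng(1)
n = 2000
x = rng.normal(size=n)
groups = {f"g{i}": rng.normal(size=(n, 4)) for i in range(5)}
groups["g0"] = groups["g0"] + 0.8 * x[:, None]
y = 0.3 * x + 0.5 * groups["g0"].sum(axis=1) + rng.normal(size=n)

results = two_step(x, y, groups)
for name, (p, rejected) in results.items():
    print(name, p, rejected)
```

Because only the sets that survive screening are tested, the Bonferroni divisor is the number of retained sets rather than the total number of sets, which is where the power gain from screening comes from. Sample splitting pays for this by halving the data available at each step, a cost that postselection inference is intended to avoid.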