Department of Statistics, University of Michigan, Ann Arbor, Michigan, USA.
Department of Biostatistics, University of Michigan, Ann Arbor, Michigan, USA.
Biometrics. 2023 Sep;79(3):1801-1813. doi: 10.1111/biom.13740. Epub 2022 Aug 31.
Integrative analyses based on statistically relevant associations between genomics and a wealth of intermediary phenotypes (such as imaging) provide vital insights into their clinical relevance in terms of the disease mechanisms. Estimates of uncertainty in the resulting integrative models are, however, unreliable unless inference accurately accounts for the selection of these associations. In this paper, we develop selection-aware Bayesian methods, which (1) counteract the impact of model selection bias through a "selection-aware posterior" in a flexible class of integrative Bayesian models after promising variables have been selected via ℓ₁-regularized algorithms; and (2) strike an inevitable trade-off between the quality of model selection and inferential power when the same data set is used for both selection and uncertainty estimation. Central to our methodological development, a carefully constructed conditional likelihood function, deployed with a reparameterization mapping, provides tractable updates when gradient-based Markov chain Monte Carlo (MCMC) sampling is used for estimating uncertainties from the selection-aware posterior. Applying our methods to a radiogenomic analysis, we successfully recover several important gene pathways and estimate uncertainties for their associations with patient survival times.