Department of Biostatistics, The University of Texas MD Anderson Cancer Center, Houston, TX, USA.
CSIRO, Royal Brisbane and Women's Hospital, Brisbane, Australia.
Bioinformatics. 2023 Jan 1;39(1). doi: 10.1093/bioinformatics/btad023.
Multilevel molecular profiling of tumors and its integrative analysis with clinical outcomes have enabled a deeper characterization of cancer treatment. Mediation analysis has emerged as a promising statistical tool to identify and quantify the intermediate mechanisms by which a gene affects an outcome. However, existing methods lack a unified approach to handling different types of outcome variables, which makes them unsuitable for high-throughput molecular profiling data with highly interconnected variables.
We develop a general mediation analysis framework for proteogenomic data that accommodates multiple exposures and multivariate mediators, with effects defined on scales appropriate for continuous, binary and survival outcomes. Our estimation method avoids imposing constraints on model parameters, such as the rare disease assumption, while handling multiple exposures and high-dimensional mediators. We compare our approach to other methods in extensive simulation studies across a range of sample sizes, disease prevalences and numbers of false mediators. Using kidney renal clear cell carcinoma proteogenomic data, we identify genes whose effects are mediated by proteins, and the underlying mechanisms, for various survival outcomes that capture short- and long-term disease-specific clinical characteristics.
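To make the direct and indirect effects concrete, the following is a minimal sketch of the simplest case only: one exposure, one mediator and a continuous outcome, with both models linear. It is not the framework described above and does not use the mediateR package; the function names (`ols`, `linear_mediation`), the simulated data and the coefficient values are all assumptions chosen for illustration. Under these linear models the indirect effect is the product of the exposure-to-mediator and mediator-to-outcome coefficients.

```python
import numpy as np


def ols(design, response):
    """Least-squares coefficients; `design` must already include an intercept column."""
    coef, *_ = np.linalg.lstsq(design, response, rcond=None)
    return coef


def linear_mediation(x, m, y):
    """Product-of-coefficients mediation for one exposure x, one mediator m, continuous y.

    Hypothetical helper for illustration; assumes linear models with no
    exposure-mediator interaction and no unmeasured confounding.
    """
    ones = np.ones(len(x))
    # Mediator model: m = a0 + alpha * x + error
    alpha = ols(np.column_stack([ones, x]), m)[1]
    # Outcome model: y = b0 + gamma * x + beta * m + error
    _, gamma, beta = ols(np.column_stack([ones, x, m]), y)
    indirect = alpha * beta  # effect of x on y transmitted through m
    return {"direct": gamma, "indirect": indirect, "total": gamma + indirect}


# Simulated data (assumed values): direct effect 0.3, indirect effect 0.8 * 0.5 = 0.4
rng = np.random.default_rng(0)
n = 5000
x = rng.normal(size=n)                       # exposure, e.g. a gene-level measurement
m = 0.8 * x + rng.normal(size=n)             # mediator, e.g. a protein abundance
y = 0.3 * x + 0.5 * m + rng.normal(size=n)   # continuous outcome

effects = linear_mediation(x, m, y)
print(effects)
```

Binary and survival outcomes, multiple exposures and high-dimensional mediators need effect definitions and estimators beyond this product rule, which is the gap the framework above addresses.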
Software is made available in an R package (https://github.com/longjp/mediateR).
Supplementary data are available at Bioinformatics online.