Center for Computational and Quantitative Genetics, Department of Human Genetics, Emory University School of Medicine, Atlanta, GA, 30322, USA.
Department of Biostatistics, Emory University School of Public Health, Atlanta, GA, 30322, USA.
Nat Commun. 2024 Aug 5;15(1):6646. doi: 10.1038/s41467-024-50983-w.
Multiple reference panels often exist for a given tissue or across multiple tissues, and multiple regression methods can be used to train gene expression imputation models for transcriptome-wide association studies (TWAS). To leverage expression imputation models (i.e., base models) trained with multiple reference panels, regression methods, and tissues, we develop a Stacked Regression based TWAS (SR-TWAS) tool that obtains optimal linear combinations of base models for a given validation transcriptomic dataset. Both simulation and real studies show that SR-TWAS improves power, owing to increased training sample sizes and strength borrowed across multiple regression methods and tissues. Leveraging base models across multiple reference panels, tissues, and regression methods, our real studies identify 6 independent significant risk genes for Alzheimer's disease (AD) dementia in supplementary motor area tissue and 9 independent significant risk genes for Parkinson's disease (PD) in substantia nigra tissue. Relevant biological interpretations are found for these significant risk genes.
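The idea of an "optimal linear combination of base models for a given validation dataset" can be illustrated with a minimal sketch. It is not the SR-TWAS implementation: it assumes the combination weights are non-negative and sum to one, that optimality means least squared error between the combined prediction and the observed validation expression, and that base models are combined through their predicted expression values. The function names (`project_to_simplex`, `stack_weights`), the solver (projected gradient descent), and the simulated data are all hypothetical choices made for illustration.

```python
import numpy as np

def project_to_simplex(v):
    """Euclidean projection of v onto {w : w >= 0, sum(w) = 1}."""
    u = np.sort(v)[::-1]                      # sort descending
    css = np.cumsum(u) - 1.0
    k = np.arange(1, len(v) + 1)
    rho = np.nonzero(u - css / k > 0)[0][-1]  # last index kept positive
    theta = css[rho] / (rho + 1)
    return np.maximum(v - theta, 0.0)

def stack_weights(base_preds, observed, n_iter=2000):
    """Simplex-constrained weights minimizing squared prediction error.

    base_preds: (n_samples, n_models) predicted expression, one column
                per base model (assumed input format).
    observed:   (n_samples,) measured expression in the validation data.
    """
    P = np.asarray(base_preds, dtype=float)
    y = np.asarray(observed, dtype=float)
    n_models = P.shape[1]
    w = np.full(n_models, 1.0 / n_models)     # start from equal weights
    lipschitz = np.linalg.norm(P, 2) ** 2     # largest singular value squared
    if lipschitz == 0.0:
        return w                              # all predictions zero: nothing to fit
    step = 1.0 / lipschitz
    for _ in range(n_iter):
        grad = P.T @ (P @ w - y)              # gradient of 0.5 * ||P w - y||^2
        w = project_to_simplex(w - step * grad)
    return w

# Simulated validation set (illustrative data only, not from the study).
rng = np.random.default_rng(0)
n = 200
expression = rng.normal(size=n)
base_preds = np.column_stack([
    expression + 0.3 * rng.normal(size=n),    # accurate base model
    expression + 1.0 * rng.normal(size=n),    # noisier base model
    rng.normal(size=n),                       # uninformative base model
])
weights = stack_weights(base_preds, expression)
stacked_prediction = base_preds @ weights
print("stacking weights:", np.round(weights, 3))
```

In this toy setting the most accurate base model should receive most of the weight and the uninformative one close to none, which is the sense in which stacking borrows strength across models without needing to pick a single best one in advance.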