Parrish Randy L, Buchman Aron S, Tasaki Shinya, Wang Yanling, Avey Denis, Xu Jishu, De Jager Philip L, Bennett David A, Epstein Michael P, Yang Jingjing
medRxiv. 2024 May 13:2023.06.20.23291605. doi: 10.1101/2023.06.20.23291605.
Multiple reference panels often exist for a given tissue or across tissues, and several regression methods can be used to train gene expression imputation models for transcriptome-wide association studies (TWAS). To leverage expression imputation models (i.e., base models) trained with multiple reference panels, regression methods, and tissues, we developed a Stacked Regression based TWAS (SR-TWAS) tool that obtains optimal linear combinations of base models for a given validation transcriptomic dataset. Both simulation and real-data studies showed that SR-TWAS improved power, owing to increased effective training sample sizes and strength borrowed across multiple regression methods and tissues. Leveraging base models across multiple reference panels, tissues, and regression methods, our real-data application studies identified 6 independent significant risk genes for Alzheimer's disease (AD) dementia in supplementary motor area tissue and 9 independent significant risk genes for Parkinson's disease (PD) in substantia nigra tissue. Biologically relevant interpretations were found for these significant risk genes.
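The idea of an "optimal linear combination of base models for a given validation transcriptomic dataset" can be illustrated with a small sketch. This is not the SR-TWAS implementation. It assumes that the stacking weights are non-negative and sum to one, and that "optimal" means minimizing the mean squared error between the weighted base-model predictions and the observed validation expression of one gene (equivalent to maximizing R² for a fixed validation set). The abstract specifies neither constraint. The function names `project_to_simplex` and `stack_weights`, the projected-gradient solver, and the toy numbers are all hypothetical choices made for illustration.

```python
def project_to_simplex(v):
    """Euclidean projection of v onto {w : w_k >= 0, sum(w) = 1}."""
    u = sorted(v, reverse=True)
    cumulative = 0.0
    theta = 0.0
    for i, u_i in enumerate(u, start=1):
        cumulative += u_i
        t = (cumulative - 1.0) / i
        if u_i - t > 0:
            theta = t  # keep the threshold from the last index that qualifies
    return [max(x - theta, 0.0) for x in v]


def stack_weights(base_predictions, observed, n_iter=2000):
    """Estimate stacking weights for K base models (hypothetical helper).

    base_predictions: list of K lists, each holding one base model's predicted
        expression for the n validation samples.
    observed: list of n observed expression values in the validation data.
    Returns K weights that are non-negative and sum to one (assumed constraint).
    """
    k = len(base_predictions)
    n = len(observed)
    w = [1.0 / k] * k  # start from equal weights

    # Step size 1/L, where L bounds the largest eigenvalue of the Hessian
    # (2/n) * P^T P by its trace.
    lipschitz = (2.0 / n) * sum(p * p for preds in base_predictions for p in preds)
    step = 1.0 / lipschitz if lipschitz > 0 else 1.0

    for _ in range(n_iter):
        # Residual of the current weighted prediction on each validation sample.
        residual = [
            sum(w[j] * base_predictions[j][i] for j in range(k)) - observed[i]
            for i in range(n)
        ]
        # Gradient of the mean squared error with respect to each weight.
        grad = [
            (2.0 / n) * sum(residual[i] * base_predictions[j][i] for i in range(n))
            for j in range(k)
        ]
        # Gradient step, then projection back onto the weight simplex.
        w = project_to_simplex([w[j] - step * grad[j] for j in range(k)])
    return w


# Toy usage with made-up numbers: three base models, six validation samples.
model_a = [1.0, 0.0, -1.0, 2.0, 0.5, -0.5]
model_b = [0.0, 1.0, 1.0, -1.0, 2.0, -2.0]
model_c = [1.0, 1.0, 1.0, 1.0, 1.0, 1.0]
validation_expression = [0.7 * a + 0.3 * b for a, b in zip(model_a, model_b)]
weights = stack_weights([model_a, model_b, model_c], validation_expression)
print(weights)
```

In this toy setup the validation expression is built from the first two base models only, so the third model should receive a weight near zero. With real data, base models trained on different reference panels, regression methods, or tissues would each supply one row of `base_predictions`.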