Melton Hunter J, Zhang Zichen, Wu Chong
Department of Statistics, Florida State University, Tallahassee, FL, USA.
Department of Biostatistics, The University of Texas MD Anderson Cancer Center, Houston, TX, USA.
medRxiv. 2023 Feb 6:2023.02.02.23285208. doi: 10.1101/2023.02.02.23285208.
Transcriptome-wide association studies (TWAS) integrate gene expression prediction models with genome-wide association studies (GWAS) to identify gene-trait associations. The power of a TWAS is determined by the sample size of the GWAS and the accuracy of the expression prediction model. Here, we present a new method, the Summary-level Unified Method for Modeling Integrated Transcriptome using Functional Annotations (SUMMIT-FA), that improves the accuracy of gene expression prediction by leveraging functional annotation resources and a large expression quantitative trait loci (eQTL) summary-level dataset. We build gene expression prediction models using SUMMIT-FA with a comprehensive functional database, MACIE, and the eQTL summary-level data from the eQTLGen consortium. By applying the resulting models to GWAS of 24 complex traits and evaluating the method in a simulation study, we show that, compared with several benchmark methods, SUMMIT-FA improves the accuracy of gene expression prediction models in whole blood, identifies significantly more gene-trait associations, and improves predictive power for identifying "silver standard" genes.
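To make concrete how a TWAS combines an expression prediction model with GWAS results, the sketch below computes a commonly used summary-statistics association test, Z = w'z / sqrt(w'Rw), where w holds the per-variant prediction weights for one gene, z the GWAS z-scores for the same variants, and R their linkage disequilibrium (LD) correlation matrix. This form is an assumption for illustration: the abstract does not state which test statistic is used, and the sketch does not represent how SUMMIT-FA trains its weights. The function names and the toy numbers are hypothetical.

```python
import math


def twas_z(weights, gwas_z, ld):
    """Summary-level TWAS statistic Z = w'z / sqrt(w'Rw) for one gene.

    weights: prediction-model weights, one per variant (assumed to be on
             the standardized-genotype scale).
    gwas_z:  GWAS z-scores for the same variants, in the same order.
    ld:      LD correlation matrix as a list of rows (n x n).
    """
    n = len(weights)
    if len(gwas_z) != n or len(ld) != n or any(len(row) != n for row in ld):
        raise ValueError("weights, gwas_z and ld must have matching dimensions")

    # Numerator: weighted sum of GWAS z-scores.
    numerator = sum(w * z for w, z in zip(weights, gwas_z))

    # Denominator: variance of predicted expression implied by the LD matrix.
    variance = sum(
        weights[i] * ld[i][j] * weights[j] for i in range(n) for j in range(n)
    )
    if variance <= 0:
        raise ValueError("predicted expression has non-positive variance")

    return numerator / math.sqrt(variance)


def two_sided_p(z):
    """Two-sided p-value for a standard normal z-score."""
    return math.erfc(abs(z) / math.sqrt(2))


if __name__ == "__main__":
    # Toy inputs, not real data: three variants for a single gene.
    w = [0.5, -0.2, 0.1]
    z = [3.0, -1.0, 0.5]
    R = [
        [1.0, 0.3, 0.0],
        [0.3, 1.0, 0.1],
        [0.0, 0.1, 1.0],
    ]
    stat = twas_z(w, z, R)
    print(f"TWAS z = {stat:.3f}, p = {two_sided_p(stat):.3g}")
```

Under this form, the two drivers of power named in the abstract appear directly: larger GWAS sample sizes yield larger z-scores at truly associated variants, and more accurate weights align w with the variants that actually regulate expression.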