Li Ang, Lin Tian, Walker Alicia, Tan Xiao, Zhao Ruolan, Yao Shuyang, Sullivan Patrick F, Hjerling-Leffler Jens, Wray Naomi R, Zeng Jian
Institute for Molecular Bioscience, University of Queensland, Brisbane, QLD, Australia.
Department of Psychiatry, University of Oxford, Oxford, UK.
medRxiv. 2025 Jun 5:2025.05.24.25328275. doi: 10.1101/2025.05.24.25328275.
Genome-wide association studies (GWAS) have discovered numerous trait-associated variants, but their biological context remains unclear. Integrating GWAS summary statistics with single-cell RNA-sequencing expression profiles can help identify the cell types in which these variants influence traits. Two main strategies have been developed to integrate these data types. The "single cell to GWAS" strategy, which covers most methods, first identifies gene sets with cell-type-specific expression and then tests them for enrichment in GWAS summary statistics. Conversely, the "GWAS to single cell" strategy begins with a list of trait-associated genes and calculates a cumulative disease score per cell from gene expression count data. We systematically evaluated 19 approaches against "ground truth" trait-cell type pairs to assess their statistical power and false positive rates. Based on these analyses, we draw seven key conclusions to guide future studies. We also propose a Cauchy approach that combines the two main strategies to maximize power for detecting trait-cell type associations.
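The abstract does not say how the Cauchy approach is configured, so the following is only a minimal sketch of the standard Cauchy combination test, assuming one p-value per strategy for a given trait-cell type pair and equal weights. The function name `cauchy_combine`, the weighting, and the example p-values are illustrative assumptions, not details taken from the paper.

```python
import math


def cauchy_combine(p_values, weights=None):
    """Combine p-values with the Cauchy combination test.

    Assumption: equal weights unless given; the source does not specify them.
    """
    if not p_values:
        raise ValueError("need at least one p-value")
    if weights is None:
        weights = [1.0] * len(p_values)
    if len(weights) != len(p_values):
        raise ValueError("weights and p_values must have the same length")
    total = float(sum(weights))
    stat = 0.0
    for p, w in zip(p_values, weights):
        if not 0.0 <= p <= 1.0:
            raise ValueError("p-values must lie in [0, 1]")
        # Clip away from 0 and 1 so the tangent transform stays finite.
        p = min(max(p, 1e-300), 1.0 - 1e-15)
        if p < 1e-15:
            # tan((0.5 - p) * pi) is approximately 1 / (p * pi) for tiny p.
            term = 1.0 / (p * math.pi)
        else:
            term = math.tan((0.5 - p) * math.pi)
        stat += (w / total) * term
    if stat > 1e15:
        # Tail approximation avoids cancellation in 0.5 - atan(stat) / pi.
        return 1.0 / (stat * math.pi)
    return 0.5 - math.atan(stat) / math.pi


# Hypothetical p-values for one trait-cell type pair, one per strategy.
p_sc_to_gwas = 0.01
p_gwas_to_sc = 0.20
combined = cauchy_combine([p_sc_to_gwas, p_gwas_to_sc])
print(f"combined p-value: {combined:.4g}")
```

The combined value is driven mainly by the smaller input p-value, which is why this kind of combination can retain power when only one of the two strategies detects an association. Inputs very close to 1 produce large negative terms, so they are clipped here as a practical safeguard.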