Biostatistics Program, Public Health Sciences Division, Fred Hutchinson Cancer Center, Seattle, WA, USA.
Department of Biostatistics, University of North Carolina at Chapel Hill, Chapel Hill, NC, USA.
Nat Commun. 2023 May 25;14(1):3030. doi: 10.1038/s41467-023-38795-w.
Mapping cell type-specific gene expression quantitative trait loci (ct-eQTLs) is a powerful way to investigate the genetic basis of complex traits. A popular method for ct-eQTL mapping is to use a linear model to assess the interaction between the genotype of a genetic locus and the abundance of a specific cell type. However, this approach requires transforming RNA-seq count data, which distorts the relationship between gene expression and cell type proportions and results in reduced power and/or inflated type I error. To address this issue, we have developed a statistical method called CSeQTL that allows for ct-eQTL mapping using bulk RNA-seq count data while taking advantage of allele-specific expression. We validated CSeQTL through simulations and real data analysis, comparing its results to those obtained from purified bulk RNA-seq data or single-cell RNA-seq data. Using our ct-eQTL findings, we were able to identify cell types relevant to 21 categories of human traits.
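The conventional linear-model approach mentioned above (not CSeQTL itself) can be illustrated with a minimal sketch. It assumes expression has already been transformed to a roughly Gaussian scale and fits ordinary least squares with a genotype-by-cell-type-proportion interaction term. The function name `interaction_test`, the variable names, and the simulated data are all hypothetical and chosen only for illustration.

```python
import numpy as np

def interaction_test(expr, genotype, cell_frac):
    """Fit expr ~ 1 + genotype + cell_frac + genotype*cell_frac by OLS.

    Returns the estimate and t-statistic of the interaction coefficient,
    which is the quantity the linear-model approach tests for a ct-eQTL.
    """
    expr = np.asarray(expr, dtype=float)
    g = np.asarray(genotype, dtype=float)
    c = np.asarray(cell_frac, dtype=float)

    # Design matrix: intercept, genotype, cell type proportion, interaction.
    X = np.column_stack([np.ones_like(g), g, c, g * c])
    beta, _, _, _ = np.linalg.lstsq(X, expr, rcond=None)

    # Classical OLS standard error for the interaction coefficient.
    resid = expr - X @ beta
    dof = X.shape[0] - X.shape[1]
    sigma2 = resid @ resid / dof
    cov = sigma2 * np.linalg.inv(X.T @ X)
    se = np.sqrt(cov[3, 3])
    return beta[3], beta[3] / se

# Simulated example (assumption: expression is already on a transformed scale).
rng = np.random.default_rng(0)
n = 500
genotype = rng.integers(0, 3, size=n)        # 0, 1, or 2 copies of the alternate allele
cell_frac = rng.uniform(0.1, 0.9, size=n)    # proportion of the cell type of interest
transformed_expr = (
    1.0
    + 0.5 * cell_frac
    + 0.8 * genotype * cell_frac             # genetic effect scales with cell type abundance
    + rng.normal(0.0, 0.3, size=n)
)

est, t_stat = interaction_test(transformed_expr, genotype, cell_frac)
print(f"interaction estimate = {est:.3f}, t = {t_stat:.2f}")
```

The sketch simulates data directly on the transformed scale, so it sidesteps the transformation step that the abstract identifies as the source of distortion. It is meant only to show what the interaction term tests, not to reproduce the count-based model that CSeQTL uses.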