Zhou Yichao, Adeluwa Temidayo, Zhu Lisha, Salazar-Magaña Sofia, Sumner Sarah, Kim Hyunki, Gona Saideep, Nyasimi Festus, Kulkarni Rohit, Powell Joseph E, Madduri Ravi, Liu Boxiang, Chen Mengjie, Im Hae Kyung
Committee on Genetics, Genomics, and Systems Biology, University of Chicago, Chicago, IL 60637, USA.
Section of Genetic Medicine, Department of Medicine, University of Chicago, Chicago, IL 60637, USA.
Cell Genom. 2025 May 14;5(5):100875. doi: 10.1016/j.xgen.2025.100875.
Transcriptome-wide association studies (TWASs) help identify disease-causing genes but often fail to pinpoint disease mechanisms at the cellular level because of the limited sample sizes and sparsity of cell-type-specific expression data. Here, we propose scPrediXcan, which integrates state-of-the-art deep learning approaches that predict epigenetic features from DNA sequences with the canonical TWAS framework. Our prediction approach, ctPred, predicts cell-type-specific expression with high accuracy and captures complex gene-regulatory grammar that linear models overlook. Applied to type 2 diabetes (T2D) and systemic lupus erythematosus (SLE), scPrediXcan outperformed the canonical TWAS framework by identifying more candidate causal genes, explaining more genome-wide association study (GWAS) loci and providing insights into the cellular specificity of TWAS hits. Overall, our results demonstrate that scPrediXcan represents a significant advance, promising to deepen our understanding of the cellular mechanisms underlying complex diseases.