Su Chang, Lee Dongsoo, Jin Peng, Zhang Jingfei
Department of Biostatistics and Bioinformatics, Emory University, Atlanta, GA, USA.
Department of Human Genetics, School of Medicine, Emory University, Atlanta, GA, USA.
Nat Commun. 2025 Apr 26;16(1):3941. doi: 10.1038/s41467-025-59306-z.
Mapping enhancers to their target genes in disease-related cell types provides critical insights into the functional mechanisms of variants identified by genome-wide association studies (GWAS). Single-cell multimodal data, which measure gene expression and chromatin accessibility in the same cells, enable cell-type-specific inference of enhancer-gene pairs. However, this task is challenged by high data sparsity, variation in sequencing depth, and the computational burden of analyzing a large number of pairs. We introduce scMultiMap, a statistical method that infers enhancer-gene associations from sparse multimodal counts using a joint latent-variable model. It adjusts for technical confounding, permits fast moment-based estimation, and provides analytically derived p-values. In blood and brain data, scMultiMap shows appropriate type I error control, high statistical power, and computational efficiency (1% of the computational cost of existing methods). When applied to Alzheimer's disease (AD) data, scMultiMap gives the highest heritability enrichment in microglia and reveals insights into the regulatory mechanisms of AD GWAS variants.
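To make the idea of depth-adjusted, moment-based estimation concrete, the following is a minimal sketch and not the scMultiMap implementation. It assumes a simple illustrative model in which each observed count is Poisson with mean equal to the cell's sequencing depth times a latent abundance, and it estimates the correlation between the latent gene and peak abundances from first and second moments. The function names (moment_correlation, simulate), the unweighted least-squares form of the estimators, and all simulation settings are assumptions made for illustration only.

```python
import numpy as np


def moment_correlation(x, y, s_x, s_y):
    """Moment-based estimate of the latent gene-peak correlation.

    Assumed model (illustrative only): x[i] ~ Poisson(s_x[i] * z_x[i]) and
    y[i] ~ Poisson(s_y[i] * z_y[i]), where z_x, z_y are latent abundances
    and s_x, s_y are per-cell sequencing depths for each modality.
    """
    x, y = np.asarray(x, dtype=float), np.asarray(y, dtype=float)
    s_x, s_y = np.asarray(s_x, dtype=float), np.asarray(s_y, dtype=float)

    # E[x_i] = s_i * mu, so regress counts on depth through the origin.
    mu_x = np.sum(s_x * x) / np.sum(s_x ** 2)
    mu_y = np.sum(s_y * y) / np.sum(s_y ** 2)
    r_x = x - s_x * mu_x
    r_y = y - s_y * mu_y

    # Var(x_i) = s_i * mu + s_i^2 * var_latent; subtract the Poisson part.
    var_x = np.sum(s_x ** 2 * (r_x ** 2 - s_x * mu_x)) / np.sum(s_x ** 4)
    var_y = np.sum(s_y ** 2 * (r_y ** 2 - s_y * mu_y)) / np.sum(s_y ** 4)

    # E[r_x * r_y] = s_x * s_y * cov_latent (Poisson noise is independent).
    w = s_x * s_y
    cov_xy = np.sum(w * r_x * r_y) / np.sum(w ** 2)

    if var_x <= 0 or var_y <= 0:
        return float("nan")  # latent variance not estimable from these data
    return float(np.clip(cov_xy / np.sqrt(var_x * var_y), -1.0, 1.0))


def simulate(n_cells=50_000, scale=2e-4, seed=0):
    """Simulate sparse paired counts whose latent correlation is 0.5."""
    rng = np.random.default_rng(seed)
    shared = rng.gamma(2.0, 1.0, n_cells)  # component common to both latents
    z_x = scale * (shared + rng.gamma(2.0, 1.0, n_cells))
    z_y = scale * (shared + rng.gamma(2.0, 1.0, n_cells))
    s_x = rng.uniform(1000, 3000, n_cells)  # hypothetical RNA depth
    s_y = rng.uniform(1000, 3000, n_cells)  # hypothetical ATAC depth
    x = rng.poisson(s_x * z_x)
    y = rng.poisson(s_y * z_y)
    return x, y, s_x, s_y


if __name__ == "__main__":
    x, y, s_x, s_y = simulate()
    print("moment-based estimate:", moment_correlation(x, y, s_x, s_y))
    print("Pearson on raw counts:", np.corrcoef(x, y)[0, 1])
```

In this simulation the Pearson correlation of the raw counts is pulled toward zero by Poisson noise and depth variation, while the moment-based estimate targets the latent correlation directly. Because the estimators are closed-form sums over cells, no iterative model fitting is needed, which is the kind of property that makes moment-based approaches fast when many pairs are tested.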