Cao Xuewei, Sun Haochen, Feng Ru, Mazumder Rahul, Buen Abad Najar Carlos F, Li Yang I, de Jager Philip L, Bennett David, Dey Kushal K, Wang Gao
Center for Statistical Genetics, The Gertrude H. Sergievsky Center, Columbia University, New York, NY, USA.
Computational and Systems Biology, Sloan Kettering Institute, Memorial Sloan Kettering Cancer Center, New York, NY, USA.
medRxiv. 2025 May 6:2025.04.17.25326042. doi: 10.1101/2025.04.17.25326042.
Multi-trait QTL (xQTL) colocalization has shown great promise in identifying causal variants with shared genetic etiology across multiple molecular modalities, contexts, and complex diseases. However, the lack of scalable and efficient methods to integrate large-scale multi-omics data limits deeper insights into xQTL regulation. Here, we propose a multi-task learning colocalization method that can scale to hundreds of traits while accounting for multiple causal variants within a genomic region of interest. The method employs a specialized gradient boosting framework that can adaptively couple colocalized traits while performing causal variant selection, thereby enhancing the detection of weaker shared signals compared to existing pairwise and multi-trait colocalization methods. We applied the method genome-wide to 17 gene-level single-nucleus and bulk xQTL datasets from the aging brain cortex of ROSMAP individuals (average ), encompassing 6 cell types, 3 brain regions, and 3 molecular modalities (expression, splicing, and protein abundance). Across molecular xQTLs, the method identified 16,503 distinct colocalization events, exhibiting 10.7(±0.74)-fold enrichment for heritability across 57 complex diseases/traits and showing strong concordance with element-gene pairs validated by CRISPR screening assays. When colocalized against Alzheimer's disease (AD) GWAS, the method identified up to 2.5-fold more distinct colocalized loci, explaining twice the AD heritability compared to fine-mapping without xQTL integration. This improvement is largely attributable to the method's enhanced sensitivity in detecting gene-distal colocalizations, as supported by strong concordance with known enhancer-gene links, highlighting its ability to identify biologically plausible AD susceptibility loci with underlying regulatory mechanisms.
Notably, several genes showed sub-threshold associations in GWAS but were identified through multi-omics colocalization, which provides new functional support for their involvement in AD pathogenesis.