Wellcome Sanger Institute, Wellcome Genome Campus, Hinxton, UK.
Open Targets, Wellcome Genome Campus, Hinxton, UK.
Nat Genet. 2021 Nov;53(11):1527-1533. doi: 10.1038/s41588-021-00945-5. Epub 2021 Oct 28.
Genome-wide association studies (GWASs) have identified many variants associated with complex traits, but identifying the causal gene(s) is a major challenge. Here, we present an open resource that provides systematic fine mapping and gene prioritization across 133,441 published human GWAS loci. We integrate genetics (GWAS Catalog and UK Biobank) with transcriptomic, proteomic and epigenomic data, including systematic disease-disease and disease-molecular trait colocalization results across 92 cell types and tissues. We identify 729 loci fine mapped to a single coding causal variant and colocalized with a single gene. We trained a machine-learning model using the fine-mapped genetics and functional genomics data and 445 gold-standard curated GWAS loci to distinguish causal genes from neighboring genes; the model outperformed a naive distance-based model. Our prioritized genes were enriched for known approved drug targets (odds ratio = 8.1, 95% confidence interval = 5.7-11.5). These results are publicly available through a web portal (http://genetics.opentargets.org), enabling users to easily prioritize genes at disease-associated loci and assess their potential as drug targets.
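As a reading aid for the enrichment statistic quoted in the abstract, the following is a minimal sketch of how an odds ratio and an approximate 95% confidence interval can be computed from a 2x2 contingency table. The abstract does not state which interval method was used, so the Wald (log-scale) interval here is an assumption, and the function name `odds_ratio_ci` and the counts in the usage example are hypothetical, not taken from the study.

```python
import math


def odds_ratio_ci(a, b, c, d, z=1.96):
    """Odds ratio and Wald confidence interval for a 2x2 table.

    Hypothetical layout (an assumption, not the study's actual table):
        a: prioritized genes that are approved drug targets
        b: prioritized genes that are not approved drug targets
        c: non-prioritized genes that are approved drug targets
        d: non-prioritized genes that are not approved drug targets
    All four counts must be positive for the Wald interval to be defined.
    """
    if min(a, b, c, d) <= 0:
        raise ValueError("all cell counts must be positive")
    odds_ratio = (a * d) / (b * c)
    # Standard error of log(OR) under the usual large-sample approximation.
    se_log_or = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(odds_ratio) - z * se_log_or)
    upper = math.exp(math.log(odds_ratio) + z * se_log_or)
    return odds_ratio, lower, upper


# Illustrative counts only; they do not come from the paper.
or_value, ci_low, ci_high = odds_ratio_ci(10, 20, 5, 40)
print(f"OR = {or_value:.2f}, 95% CI = {ci_low:.2f}-{ci_high:.2f}")
```

An interval whose lower bound exceeds 1, as with the reported 5.7-11.5, indicates enrichment that is unlikely to be explained by chance alone under this approximation.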