Leveraging supervised learning for functionally informed fine-mapping of cis-eQTLs identifies an additional 20,913 putative causal eQTLs.

Affiliations

Broad Institute of MIT and Harvard, Cambridge, MA, USA.

Analytic and Translational Genetics Unit, Massachusetts General Hospital, Boston, MA, USA.

Publication information

Nat Commun. 2021 Jun 7;12(1):3394. doi: 10.1038/s41467-021-23134-8.

DOI: 10.1038/s41467-021-23134-8

PMID: 34099641

Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC8184741/

Abstract

The large majority of variants identified by GWAS are non-coding, motivating detailed characterization of the function of non-coding variants. Experimental methods to assess variants' effect on gene expressions in native chromatin context via direct perturbation are low-throughput. Existing high-throughput computational predictors thus have lacked large gold standard sets of regulatory variants for training and validation. Here, we leverage a set of 14,807 putative causal eQTLs in humans obtained through statistical fine-mapping, and we use 6121 features to directly train a predictor of whether a variant modifies nearby gene expression. We call the resulting prediction the expression modifier score (EMS). We validate EMS by comparing its ability to prioritize functional variants with other major scores. We then use EMS as a prior for statistical fine-mapping of eQTLs to identify an additional 20,913 putatively causal eQTLs, and we incorporate EMS into co-localization analysis to identify 310 additional candidate genes across UK Biobank phenotypes.
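
To illustrate how a functional score can act as a prior in statistical fine-mapping, the sketch below computes posterior inclusion probabilities (PIPs) under a deliberately simplified model. It assumes a single causal variant per locus and uses Wakefield's approximate Bayes factor; neither choice is taken from the abstract, and this is not the authors' pipeline. The function names, the prior effect variance of 0.04, and all summary statistics and prior weights are hypothetical values chosen for illustration; in practice the weights would come from a score such as EMS.

```python
import math


def wakefield_log_abf(beta, se, prior_var=0.04):
    """Log approximate Bayes factor (Wakefield) in favour of association.

    beta, se  : marginal effect estimate and its standard error
    prior_var : assumed prior variance of the true effect size (hypothetical default)
    """
    v = se ** 2
    z2 = (beta / se) ** 2
    r = prior_var / (prior_var + v)
    return 0.5 * (math.log(1.0 - r) + r * z2)


def posterior_inclusion(log_bfs, prior_weights):
    """PIPs assuming exactly one causal variant: PIP_i is proportional to prior_i * BF_i.

    prior_weights must be strictly positive; they are normalized internally.
    """
    total = sum(prior_weights)
    log_terms = [lb + math.log(w / total) for lb, w in zip(log_bfs, prior_weights)]
    m = max(log_terms)  # subtract the maximum for numerical stability
    unnorm = [math.exp(t - m) for t in log_terms]
    s = sum(unnorm)
    return [u / s for u in unnorm]


# Hypothetical summary statistics for three variants in one locus.
betas = [0.30, 0.29, 0.10]
ses = [0.05, 0.05, 0.05]
log_bfs = [wakefield_log_abf(b, s) for b, s in zip(betas, ses)]

# Uniform prior versus a hypothetical functional prior favouring the second variant.
uniform = posterior_inclusion(log_bfs, [1.0, 1.0, 1.0])
informed = posterior_inclusion(log_bfs, [0.01, 0.20, 0.01])

print("uniform prior PIPs: ", [round(p, 3) for p in uniform])
print("informed prior PIPs:", [round(p, 3) for p in informed])
```

With these made-up inputs, the first two variants have nearly identical association statistics, so the prior largely decides which of them receives the higher PIP. That is the situation in which a functionally informed prior can resolve variants that association evidence alone cannot separate.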

Figure 1: https://cdn.ncbi.nlm.nih.gov/pmc/blobs/4e88/8184741/266c927af39c/41467_2021_23134_Fig1_HTML.jpg

Similar articles

Leveraging supervised learning for functionally informed fine-mapping of cis-eQTLs identifies an additional 20,913 putative causal eQTLs.

Nat Commun. 2021 Jun 7;12(1):3394. doi: 10.1038/s41467-021-23134-8.

Leveraging allelic imbalance to refine fine-mapping for eQTL studies.

PLoS Genet. 2019 Dec 13;15(12):e1008481. doi: 10.1371/journal.pgen.1008481. eCollection 2019 Dec.

Functionally informed fine-mapping and polygenic localization of complex trait heritability.

Nat Genet. 2020 Dec;52(12):1355-1363. doi: 10.1038/s41588-020-00735-5. Epub 2020 Nov 16.

Illuminating links between cis-regulators and trans-acting variants in the human prefrontal cortex.

Genome Med. 2022 Nov 24;14(1):133. doi: 10.1186/s13073-022-01133-8.

Fine mapping of candidate effector genes for heart rate.

Hum Genet. 2024 Oct;143(9-10):1207-1221. doi: 10.1007/s00439-024-02684-z. Epub 2024 Jul 6.

Conditional entropy in variation-adjusted windows detects selection signatures associated with expression quantitative trait loci (eQTLs).

BMC Genomics. 2015;16 Suppl 8(Suppl 8):S8. doi: 10.1186/1471-2164-16-S8-S8. Epub 2015 Jun 18.

Large-scale East-Asian eQTL mapping reveals novel candidate genes for LD mapping and the genomic landscape of transcriptional effects of sequence variants.

PLoS One. 2014 Jun 23;9(6):e100924. doi: 10.1371/journal.pone.0100924. eCollection 2014.

TAGOOS: genome-wide supervised learning of non-coding loci associated to complex phenotypes.

Nucleic Acids Res. 2019 Aug 22;47(14):e79. doi: 10.1093/nar/gkz320.

Fine-mapping and cell-specific enrichment at corneal resistance factor loci prioritize candidate causal regulatory variants.

Commun Biol. 2020 Dec 11;3(1):762. doi: 10.1038/s42003-020-01497-w.

Multiomic QTL mapping reveals phenotypic complexity of GWAS loci and prioritizes putative causal variants.

Cell Genom. 2025 Mar 12;5(3):100775. doi: 10.1016/j.xgen.2025.100775. Epub 2025 Feb 21.

Cited by

Language Modelling Techniques for Analysing the Impact of Human Genetic Variation.

Bioinform Biol Insights. 2025 Sep 2;19:11779322251358314. doi: 10.1177/11779322251358314. eCollection 2025.

Towards improved fine-mapping of candidate causal variants.

Nat Rev Genet. 2025 Jul 28. doi: 10.1038/s41576-025-00869-4.

Proteomic risk scores for predicting common diseases using linear and neural network models in the UK biobank.

Sci Rep. 2025 Jul 1;15(1):20520. doi: 10.1038/s41598-025-06232-1.

Linking regulatory variants to target genes by integrating single-cell multiome methods and genomic distance.

Nat Genet. 2025 Jun 12. doi: 10.1038/s41588-025-02220-3.

Non-coding variation in dementias: mechanisms, insights, and challenges.

NPJ Dement. 2025;1(1):9. doi: 10.1038/s44400-025-00012-4. Epub 2025 Jun 3.

Developing a general AI model for integrating diverse genomic modalities and comprehensive genomic knowledge.

bioRxiv. 2025 May 14:2025.05.08.652986. doi: 10.1101/2025.05.08.652986.

JOB: Japan Omics Browser provides integrative visualization of multi-omics data.

BMC Genomics. 2025 May 7;26(1):451. doi: 10.1186/s12864-025-11639-1.

Mapping the regulatory effects of common and rare non-coding variants across cellular and developmental contexts in the brain and heart.

bioRxiv. 2025 Feb 20:2025.02.18.638922. doi: 10.1101/2025.02.18.638922.

Benchmarking DNA Sequence Models for Causal Regulatory Variant Prediction in Human Genetics.

bioRxiv. 2025 Mar 4:2025.02.11.637758. doi: 10.1101/2025.02.11.637758.

ChromBPNet: bias factorized, base-resolution deep learning models of chromatin accessibility reveal cis-regulatory sequence syntax, transcription factor footprints and regulatory variants.

bioRxiv. 2025 Jan 8:2024.12.25.630221. doi: 10.1101/2024.12.25.630221.
