Systematic analysis of binding of transcription factors to noncoding variants.

Affiliations

School of Medicine, Northwest University, Xi'an, China.

Ludwig Institute for Cancer Research, La Jolla, CA, USA.

Publication information

Nature. 2021 Mar;591(7848):147-151. doi: 10.1038/s41586-021-03211-0. Epub 2021 Jan 27.

Abstract

Many sequence variants have been linked to complex human traits and diseases, but deciphering their biological functions remains challenging, as most of them reside in noncoding DNA. Here we have systematically assessed the binding of 270 human transcription factors to 95,886 noncoding variants in the human genome using an ultra-high-throughput multiplex protein-DNA binding assay, termed single-nucleotide polymorphism evaluation by systematic evolution of ligands by exponential enrichment (SNP-SELEX). The resulting 828 million measurements of transcription factor-DNA interactions enable estimation of the relative affinity of these transcription factors to each variant in vitro and evaluation of the current methods to predict the effects of noncoding variants on transcription factor binding. We show that the position weight matrices of most transcription factors lack sufficient predictive power, whereas the support vector machine combined with the gapped k-mer representation shows much improved performance when assessed on results from independent SNP-SELEX experiments involving a new set of 61,020 sequence variants. We report highly predictive models for 94 human transcription factors and demonstrate their utility in genome-wide association studies and understanding of the molecular pathways involved in diverse human traits and diseases.
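
As a rough illustration of the kind of position weight matrix (PWM) prediction the abstract evaluates, the sketch below scores the reference and alternate alleles of a variant against a log-odds matrix and reports the difference. It is a minimal sketch, not the authors' pipeline: the count matrix, flanking sequences, pseudocount, uniform background, and all function names are assumptions made up for this example, and only the forward strand is scanned.

```python
import math

# Hypothetical 4-position count matrix (one dict per motif position).
# The counts are invented for illustration and do not come from the paper.
COUNTS = [
    {"A": 8, "C": 1, "G": 1, "T": 0},
    {"A": 0, "C": 9, "G": 0, "T": 1},
    {"A": 1, "C": 0, "G": 9, "T": 0},
    {"A": 0, "C": 1, "G": 0, "T": 9},
]


def log_odds_matrix(counts, pseudocount=1.0, background=0.25):
    """Convert base counts to log2 odds against a uniform background (assumed)."""
    pwm = []
    for column in counts:
        total = sum(column.values()) + 4 * pseudocount
        pwm.append(
            {b: math.log2((column[b] + pseudocount) / total / background)
             for b in "ACGT"}
        )
    return pwm


def best_window_score(pwm, seq):
    """Return the highest PWM score over all forward-strand windows of seq."""
    width = len(pwm)
    return max(
        sum(pwm[i][seq[start + i]] for i in range(width))
        for start in range(len(seq) - width + 1)
    )


def allelic_delta(pwm, left_flank, ref, alt, right_flank):
    """Score difference (alt minus ref); negative means alt is predicted to weaken binding."""
    ref_score = best_window_score(pwm, left_flank + ref + right_flank)
    alt_score = best_window_score(pwm, left_flank + alt + right_flank)
    return alt_score - ref_score


PWM = log_odds_matrix(COUNTS)
# Hypothetical C>T variant in the second position of an ACGT-like site.
delta = allelic_delta(PWM, "TTA", "C", "T", "GTT")
print(f"PWM score change (alt - ref): {delta:.2f}")  # prints -2.32
```

A score difference like this is the sort of PWM-based prediction that the abstract reports as lacking sufficient predictive power for most transcription factors; the gapped k-mer support vector machine models it favors are not reproduced here.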
