Discovering unknown human and mouse transcription factor binding sites and their characteristics from ChIP-seq data.

Affiliations

Biodiversity Research Center, Academia Sinica, 115 Taipei, Taiwan.

Institute for Comparative Genomics, American Museum of Natural History, New York, NY 10024.

Publication information

Proc Natl Acad Sci U S A. 2021 May 18;118(20). doi: 10.1073/pnas.2026754118.

Abstract

Transcription factor binding sites (TFBSs) are essential for gene regulation, but the number of known TFBSs remains limited. We aimed to discover and characterize unknown TFBSs by developing a computational pipeline for analyzing ChIP-seq (chromatin immunoprecipitation followed by sequencing) data. Applying it to the latest ENCODE ChIP-seq data for human and mouse, we found that using the irreproducible discovery rate as a quality-control criterion resulted in many experiments being unnecessarily discarded. By contrast, the number of motif occurrences in ChIP-seq peak regions provides a highly effective criterion, which is reliable even if supported by only one experimental replicate. In total, we obtained 2,058 motifs from 1,089 experiments for 354 human TFs and 163 motifs from 101 experiments for 34 mouse TFs. Among these motifs, 487 have not previously been reported. Mapping the canonical motifs to the human genome reveals a high TFBS density ±2 kb around transcription start sites (TSSs) with a peak at -50 bp. On average, a promoter contains 5.7 TFBSs. However, 70% of TFBSs are in introns (41%) and intergenic regions (29%), whereas only 12% are in promoters (-1 kb to +100 bp from TSSs). Notably, some TFs (e.g., CTCF, JUN, JUNB, and NFE2) have motifs enriched in intergenic regions, including enhancers. We inferred 142 cobinding TF pairs and 186 (including 115 completely) tethered binding TF pairs, indicating frequent interactions between TFs and a higher frequency of tethered binding than cobinding. This study provides a large number of previously undocumented motifs and insights into the biological and genomic features of TFBSs.
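The promoter window quoted in the abstract (-1 kb to +100 bp from a TSS) can be made concrete with a short, strand-aware check. This is a minimal sketch for illustration only, not the authors' pipeline: the function names, the inclusive window boundaries, and the single-base site coordinate are all assumptions introduced here.

```python
# Illustrative sketch only; names and boundary handling are assumptions,
# not taken from the paper's pipeline.

PROMOTER_UPSTREAM = -1000   # -1 kb relative to the TSS (from the abstract)
PROMOTER_DOWNSTREAM = 100   # +100 bp relative to the TSS (from the abstract)


def offset_from_tss(site_pos: int, tss_pos: int, strand: str) -> int:
    """Return the signed distance of a site from a TSS.

    Negative values are upstream of the TSS, positive values downstream,
    taking the strand of the gene into account.
    """
    if strand not in ("+", "-"):
        raise ValueError("strand must be '+' or '-'")
    offset = site_pos - tss_pos
    # On the minus strand, larger genomic coordinates lie upstream.
    return offset if strand == "+" else -offset


def in_promoter(site_pos: int, tss_pos: int, strand: str) -> bool:
    """True if the site falls in the -1 kb to +100 bp promoter window.

    Both ends of the window are treated as inclusive (an assumption).
    """
    offset = offset_from_tss(site_pos, tss_pos, strand)
    return PROMOTER_UPSTREAM <= offset <= PROMOTER_DOWNSTREAM


if __name__ == "__main__":
    # A site 50 bp upstream of a plus-strand TSS lies inside the window.
    print(in_promoter(9_950, 10_000, "+"))
    # A site 2 kb downstream of the same TSS does not.
    print(in_promoter(12_000, 10_000, "+"))
```

A site at offset -50 bp, where the abstract reports the TFBS density peak, falls inside this window on either strand once the sign is flipped for minus-strand genes.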
