Genome-wide transcription factor binding site/promoter databases for the analysis of gene sets and co-occurrence of transcription factor binding motifs.

Affiliation

Department of Oncology, Clinical Sciences, Lund University and Lund University Hospital, SE-22185 LUND, Sweden.

Publication information

BMC Genomics. 2010 Mar 1;11:145. doi: 10.1186/1471-2164-11-145.

DOI:10.1186/1471-2164-11-145

PMID:20193056

Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC2841680/

Abstract

BACKGROUND

Global gene expression profiling is a well-established approach to understanding biological processes. One of the major goals of these investigations is to identify sets of genes with similar expression patterns. Such gene signatures may be very informative and reveal new aspects of particular biological processes. A logical and systematic next step is to reduce the identified gene signatures to the regulatory components that induce the relevant gene expression changes. A central issue in this context is to identify transcription factors, or transcription factor binding sites (TFBS), likely to be of importance for the expression of the gene signatures.

RESULTS

We develop a strategy that efficiently produces TFBS/promoter databases based on user-defined criteria. The resulting databases comprise all genes in the Santa Cruz database and the positions of all TFBS supplied by the user as position weight matrices. These databases are then used for two purposes: to identify significant TFBS in the promoters of sets of genes, and to identify clusters of co-occurring TFBS. We use two criteria for significance: significantly enriched TFBS, in terms of the total number of binding sites in the promoters, and significantly present TFBS, in terms of the fraction of promoters with binding sites. Significant TFBS are identified by a re-sampling procedure in which the query gene set is compared with, typically, 10^5 gene lists of similar size randomly drawn from the TFBS/promoter database. We apply this strategy to a large number of published ChIP-Chip data sets and show that the proposed approach faithfully reproduces ChIP-Chip results. The strategy also identifies relevant TFBS when analyzing gene signatures obtained from the MSigDB database. In addition, we show that several TFBS are highly correlated and that co-occurring TFBS define functionally related sets of genes.
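The re-sampling test described above can be sketched in a few lines of Python. This is a minimal illustration, not the authors' implementation: the database layout (a dict mapping each gene to per-TFBS site counts), the function name `resampling_p_values`, the toy gene names, and the add-one correction on the empirical p-value are all assumptions made for the example. It uses 1000 random gene lists to stay fast, whereas the abstract mentions typically 10^5.

```python
import random

def resampling_p_values(db, query_genes, tfbs, n_resamples=1000, seed=0):
    """Empirical p-values for one TFBS in a query gene set.

    db maps gene -> {tfbs_name: number_of_sites_in_promoter}
    (a hypothetical layout chosen for this sketch).
    Returns (p_enriched, p_present).
    """
    rng = random.Random(seed)
    genes = list(db)

    def stats(gene_list):
        counts = [db[g].get(tfbs, 0) for g in gene_list]
        total_sites = sum(counts)                          # "enriched" criterion
        fraction_with_site = sum(c > 0 for c in counts) / len(counts)  # "present" criterion
        return total_sites, fraction_with_site

    obs_total, obs_fraction = stats(query_genes)
    ge_total = ge_fraction = 0
    for _ in range(n_resamples):
        # Random gene list of the same size as the query set.
        sample = rng.sample(genes, len(query_genes))
        total, fraction = stats(sample)
        ge_total += total >= obs_total
        ge_fraction += fraction >= obs_fraction
    # Add-one correction (an assumption here) keeps p-values above zero.
    p_enriched = (ge_total + 1) / (n_resamples + 1)
    p_present = (ge_fraction + 1) / (n_resamples + 1)
    return p_enriched, p_present

# Toy database: 200 genes, of which the first 20 carry three "E2F" sites each.
toy_db = {f"gene{i}": ({"E2F": 3} if i < 20 else {}) for i in range(200)}
motif_genes = [f"gene{i}" for i in range(20)]
control_genes = [f"gene{i}" for i in range(20, 40)]

p_enriched, p_present = resampling_p_values(toy_db, motif_genes, "E2F")
print(p_enriched, p_present)
```

Reporting the two p-values separately mirrors the two criteria in the text: a motif can be enriched because a few promoters carry many sites, or present because many promoters carry at least one.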

CONCLUSIONS

The presented approach to promoter analysis faithfully reproduces the results from several ChIP-Chip and MSigDB-derived gene sets and hence may prove to be an important method in the analysis of gene signatures obtained through ChIP-Chip or global gene expression experiments. We show that TFBS are organized in clusters of co-occurring TFBS that together define highly coherent sets of genes.
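One simple way to look for co-occurring TFBS is to correlate their presence/absence patterns across promoters. The abstract does not state which correlation measure was used, so the phi coefficient (Pearson correlation of 0/1 vectors), the 0.8 threshold, and the names `phi_coefficient` and `co_occurring_pairs` below are assumptions for illustration only.

```python
from itertools import combinations
from math import sqrt

def phi_coefficient(x, y):
    """Pearson correlation of two equal-length 0/1 vectors (phi coefficient)."""
    n = len(x)
    n11 = sum(a and b for a, b in zip(x, y))  # promoters carrying both motifs
    nx, ny = sum(x), sum(y)
    denom = sqrt(nx * (n - nx) * ny * (n - ny))
    if denom == 0:
        return 0.0  # a motif present everywhere or nowhere carries no signal
    return (n * n11 - nx * ny) / denom

def co_occurring_pairs(presence, threshold=0.8):
    """Return TFBS pairs whose presence vectors correlate at or above threshold."""
    return [
        (a, b)
        for a, b in combinations(sorted(presence), 2)
        if phi_coefficient(presence[a], presence[b]) >= threshold
    ]

# Toy presence matrix: one 0/1 entry per promoter for each hypothetical motif.
presence = {
    "A": [1, 1, 1, 0, 0, 0],
    "B": [1, 1, 1, 0, 0, 0],
    "C": [0, 0, 0, 1, 1, 1],
    "D": [1, 0, 1, 0, 1, 0],
}
pairs = co_occurring_pairs(presence)
print(pairs)
```

Pairs passing the threshold could then be grouped into clusters, for example by treating them as edges of a graph and taking connected components.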
