Context specific transcription factor prediction.

Authors

Yang Eric, Simcha David, Almon Richard R, Dubois Debra C, Jusko William J, Androulakis Ioannis P

Affiliation

Biomedical Engineering Department, Rutgers University, Piscataway, NJ 08854, USA.

Publication information

Ann Biomed Eng. 2007 Jun;35(6):1053-67. doi: 10.1007/s10439-007-9268-z. Epub 2007 Mar 22.

DOI:10.1007/s10439-007-9268-z

PMID:17377845

Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC4184431/

Abstract

One of the goals of systems biology is to identify the regulatory mechanisms that govern an organism's response to external stimuli. Transcription factors have been hypothesized to be major contributors to this response, and a great deal of work has been done to predict the set of transcription factors that regulate a given gene. Most current methods seek to identify possible binding sites from genomic sequence. Initial attempts at predicting transcription factor binding sites from genomic sequence suffered from false positives. Making the problem more difficult, it has also been shown that predicted binding sites which turn out to be false positives can still bind their corresponding transcription factors in vitro. One way to address this is phylogenetic analysis, in which only regions that show high evolutionary conservation are analyzed. However, such an approach may be too stringent because of the degeneracy found in transcription factor binding site position weight matrices. Because of this degeneracy, only a few bases may need to be conserved across species. Therefore, a sequence that does not show a high level of evolutionary conservation may still show high affinity for the same transcription factor. In predicting transcription factor binding we explore the notion that "Co-expression implies co-regulation" [Allocco et al. BMC Bioinformatics 5:18, 2004]. When multiple co-expressed genes require similar transcription factor binding sites, there is a basis for eliminating false positives. This method allows the selection of transcription factor binding sites that are active under a given experimental paradigm, thereby allowing us to indirectly incorporate the effects of chromosome and recognition site presentation upon transcription factor binding prediction. Rather than having to rationalize that a few transcription factor binding sites are over-represented in a cluster of genes, one can show that a few transcription factors are active in the cluster of genes that have been grouped together. Although the method focuses on predicting experiment-specific transcription factor binding sites, applying it iteratively across different experiments could yield a comprehensive set of transcription factor binding sites that regulate the dynamic responses of biological systems under a variety of conditions, and hence a more comprehensive model of transcriptional regulation.
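The two ideas the abstract relies on, degenerate position weight matrices and binding sites shared across co-expressed genes, can be illustrated with a small sketch. This is not the authors' algorithm: the matrix values, the uniform background, the gene names, the promoter strings, and the score threshold are all hypothetical and chosen only for illustration.

```python
import math

# Hypothetical toy position weight matrix (PWM): one dict of base
# probabilities per motif position. All values are illustrative only.
PWM = [
    {"A": 0.7, "C": 0.1, "G": 0.1, "T": 0.1},
    {"A": 0.1, "C": 0.1, "G": 0.7, "T": 0.1},
    {"A": 0.25, "C": 0.25, "G": 0.25, "T": 0.25},  # fully degenerate position
    {"A": 0.1, "C": 0.7, "G": 0.1, "T": 0.1},
]
BACKGROUND = 0.25  # assumed uniform base frequency

# Hypothetical promoter sequences for one cluster of co-expressed genes.
CLUSTER = {
    "geneA": "TTAGGCTT",
    "geneB": "CCAGTCAA",
    "geneC": "TTTTTTTT",
}


def log_odds(window):
    """Log2-odds score of one motif-width window against the background."""
    return sum(math.log2(col[base] / BACKGROUND) for col, base in zip(PWM, window))


def best_hit(sequence):
    """Highest window score in a sequence at least as long as the motif."""
    width = len(PWM)
    return max(
        log_odds(sequence[i:i + width])
        for i in range(len(sequence) - width + 1)
    )


def fraction_with_site(promoters, threshold):
    """Fraction of promoters with at least one window scoring >= threshold."""
    hits = sum(1 for seq in promoters.values() if best_hit(seq) >= threshold)
    return hits / len(promoters)
```

In this toy setup, windows that differ only at the degenerate third position (for example `AGTC` and `AGAC`) receive the same score, which is why strict sequence conservation can be too stringent a filter. `fraction_with_site(CLUSTER, 4.0)` then reports how many promoters in the cluster carry a candidate site, the kind of shared evidence that can be used to discount isolated false positives.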

Similar articles

Context specific transcription factor prediction.

Ann Biomed Eng. 2007 Jun;35(6):1053-67. doi: 10.1007/s10439-007-9268-z. Epub 2007 Mar 22.

Integrating genomic data to predict transcription factor binding.

Genome Inform. 2005;16(1):83-94.

A multiple-feature framework for modelling and predicting transcription factor binding sites.

Bioinformatics. 2005 Jul 15;21(14):3082-8. doi: 10.1093/bioinformatics/bti477. Epub 2005 May 19.

BLISS: binding site level identification of shared signal-modules in DNA regulatory sequences.

BMC Bioinformatics. 2006 Jun 7;7:287. doi: 10.1186/1471-2105-7-287.

WeederH: an algorithm for finding conserved regulatory motifs and regions in homologous sequences.

BMC Bioinformatics. 2007 Feb 7;8:46. doi: 10.1186/1471-2105-8-46.

CONREAL: conserved regulatory elements anchored alignment algorithm for identification of transcription factor binding sites by phylogenetic footprinting.

Genome Res. 2004 Jan;14(1):170-8. doi: 10.1101/gr.1642804. Epub 2003 Dec 12.

Transcription binding site prediction using Markov models.

J Bioinform Comput Biol. 2006 Apr;4(2):425-41. doi: 10.1142/s0219720006001813.

Molecular and structural considerations of TF-DNA binding for the generation of biologically meaningful and accurate phylogenetic footprinting analysis: the LysR-type transcriptional regulator family as a study model.

BMC Genomics. 2016 Aug 27;17(1):686. doi: 10.1186/s12864-016-3025-3.

Using TESS to predict transcription factor binding sites in DNA sequence.

Curr Protoc Bioinformatics. 2008 Mar;Chapter 2:Unit 2.6. doi: 10.1002/0471250953.bi0206s21.

Sequence features of DNA binding sites reveal structural class of associated transcription factor.

Bioinformatics. 2006 Jan 15;22(2):157-63. doi: 10.1093/bioinformatics/bti731. Epub 2005 Nov 2.

Cited by

Understanding Physiology in the Continuum: Integration of Information from Multiple -Omics Levels.

Front Pharmacol. 2017 Feb 27;8:91. doi: 10.3389/fphar.2017.00091. eCollection 2017.

Gene Regulatory Network Inference of Immunoresponsive Gene 1 (IRG1) Identifies Interferon Regulatory Factor 1 (IRF1) as Its Transcriptional Regulator in Mammalian Macrophages.

PLoS One. 2016 Feb 12;11(2):e0149050. doi: 10.1371/journal.pone.0149050. eCollection 2016.

A computational approach for genome-wide mapping of splicing factor binding sites.

Genome Biol. 2009;10(3):R30. doi: 10.1186/gb-2009-10-3-r30. Epub 2009 Mar 18.

References

FootPrinter3: phylogenetic footprinting in partially alignable sequences.

Nucleic Acids Res. 2006 Jul 1;34(Web Server issue):W617-20. doi: 10.1093/nar/gkl123.

MicroFootPrinter: a tool for phylogenetic footprinting in prokaryotic genomes.

Nucleic Acids Res. 2006 Jul 1;34(Web Server issue):W366-8. doi: 10.1093/nar/gkl069.

Cracking the genome's second code: enhancer detection by combined phylogenetic footprinting and transgenic fish and frog embryos.

Methods. 2006 Jul;39(3):212-9. doi: 10.1016/j.ymeth.2005.12.005.

Uncovering a hidden distributed architecture behind scale-free transcriptional regulatory networks.

J Mol Biol. 2006 Jun 30;360(1):204-12. doi: 10.1016/j.jmb.2006.04.026. Epub 2006 Apr 27.

A clustering property of highly-degenerate transcription factor binding sites in the mammalian genome.

Nucleic Acids Res. 2006 May 2;34(8):2238-46. doi: 10.1093/nar/gkl248. Print 2006.

Familial combined hyperlipidemia: upstream transcription factor 1 and beyond.

Curr Opin Lipidol. 2006 Apr;17(2):101-9. doi: 10.1097/01.mol.0000217890.54875.13.

Evolutionary dynamics of prokaryotic transcriptional regulatory networks.

J Mol Biol. 2006 Apr 28;358(2):614-33. doi: 10.1016/j.jmb.2006.02.019. Epub 2006 Feb 28.

Integrating genomic data to predict transcription factor binding.

Genome Inform. 2005;16(1):83-94.

Topology, tinkering and evolution of the human transcription factor network.

FEBS J. 2005 Dec;272(24):6423-34. doi: 10.1111/j.1742-4658.2005.05041.x.

Systematic survey reveals general applicability of "guilt-by-association" within gene coexpression networks.

BMC Bioinformatics. 2005 Sep 14;6:227. doi: 10.1186/1471-2105-6-227.
