Predicting functional transcription factor binding through alignment-free and affinity-based analysis of orthologous promoter sequences.

Author information

Ward Lucas D, Bussemaker Harmen J

Affiliation

Department of Biological Sciences, Columbia University, New York, NY 10027, USA.

Publication information

Bioinformatics. 2008 Jul 1;24(13):i165-71. doi: 10.1093/bioinformatics/btn154.

DOI:10.1093/bioinformatics/btn154

PMID:18586710

Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC2718632/

Abstract

MOTIVATION

The identification of transcription factor (TF) binding sites and the regulatory circuitry that they define is currently an area of intense research. Data from whole-genome chromatin immunoprecipitation (ChIP-chip), whole-genome expression microarrays, and sequencing of multiple closely related genomes have all proven useful. By and large, existing methods treat the interpretation of functional data as a classification problem (between bound and unbound DNA), and the analysis of comparative data as a problem of local alignment (to recover phylogenetic footprints of presumably functional elements). Both of these approaches suffer from the inability to model and detect low-affinity binding sites, which have recently been shown to be abundant and functional.

RESULTS

We have developed a method that discovers functional regulatory targets of TFs by predicting the total affinity of each promoter for those factors and then comparing that affinity across orthologous promoters in closely related species. At each promoter, we consider the minimum affinity among orthologs to be the fraction of the affinity that is functional. Because we calculate the affinity of the entire promoter, our method is independent of local alignment. By comparing with functional annotation information and gene expression data in Saccharomyces cerevisiae, we have validated that this biophysically motivated use of evolutionary conservation gives rise to dramatic improvement in prediction of regulatory connectivity and factor-factor interactions compared to the use of a single genome. We propose novel biological functions for several yeast TFs, including the factors Snt2 and Stb4, for which no function has been reported. Our affinity-based approach towards comparative genomics may allow a more quantitative analysis of the principles governing the evolution of non-coding DNA.
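
The two steps described above (score the whole promoter, then take the minimum across orthologs) can be illustrated with a minimal sketch. It is not the MatrixREDUCE implementation: the names `Psam`, `total_affinity`, and `conserved_affinity` are hypothetical, the affinity matrix is assumed to hold per-position relative affinities with the optimal base set to 1.0, and only the forward strand is scanned for brevity.

```python
from typing import Dict, List

# Assumed representation (hypothetical): one dict per motif position, mapping
# each base to its affinity relative to the optimal base at that position.
Psam = List[Dict[str, float]]


def total_affinity(seq: str, psam: Psam) -> float:
    """Sum the relative affinity of every window in seq (forward strand only).

    Each window's affinity is the product of its per-position relative
    affinities, so weak sites still contribute and no site threshold is needed.
    """
    width = len(psam)
    total = 0.0
    for start in range(len(seq) - width + 1):
        window_affinity = 1.0
        for offset, weights in enumerate(psam):
            # Bases missing from the matrix (e.g. N) are treated as non-binding.
            window_affinity *= weights.get(seq[start + offset], 0.0)
        total += window_affinity
    return total


def conserved_affinity(orthologs: Dict[str, str], psam: Psam) -> float:
    """Return the minimum total promoter affinity across orthologous promoters.

    No alignment is performed: each promoter is scored as a whole and only
    the resulting numbers are compared.
    """
    return min(total_affinity(seq, psam) for seq in orthologs.values())


if __name__ == "__main__":
    # Toy matrix preferring the site ACG; every other base scores 0.5.
    toy_psam: Psam = [
        {"A": 1.0, "C": 0.5, "G": 0.5, "T": 0.5},
        {"A": 0.5, "C": 1.0, "G": 0.5, "T": 0.5},
        {"A": 0.5, "C": 0.5, "G": 1.0, "T": 0.5},
    ]
    # Made-up promoter fragments, not real yeast sequence.
    toy_orthologs = {"species_1": "TTACGTT", "species_2": "TTACTTT"}
    print(conserved_affinity(toy_orthologs, toy_psam))
```

Because the minimum is taken over whole-promoter scores, a binding site that has moved within the promoter in one species still counts as conserved, which a local-alignment footprint would miss.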

AVAILABILITY

The MatrixREDUCE software package is available from http://www.bussemakerlab.org/software/MatrixREDUCE.

SUPPLEMENTARY INFORMATION

Supplementary data are available at Bioinformatics online.
