A biophysical approach to transcription factor binding site discovery.

Authors

Djordjevic Marko, Sengupta Anirvan M, Shraiman Boris I

Affiliation

Department of Physics, Columbia University, New York, New York 10025, USA.

Publication

Genome Res. 2003 Nov;13(11):2381-90. doi: 10.1101/gr.1271603.

DOI:10.1101/gr.1271603

PMID:14597652

Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC403756/

Abstract

Identification of transcription factor binding sites within regulatory segments of genomic DNA is an important step toward understanding of the regulatory circuits that control expression of genes. Here, we describe a novel bioinformatics method that bases classification of potential binding sites explicitly on the estimate of sequence-specific binding energy of a given transcription factor. The method also estimates the chemical potential of the factor that defines the threshold of binding. In contrast with the widely used information-theoretic weight matrix method, the new approach correctly describes saturation in the transcription factor/DNA binding probability. This results in a significant improvement in the number of expected false positives, particularly in the ubiquitous case of low-specificity factors. In the strong binding limit, the algorithm is related to the "support vector machine" approach to pattern recognition. The new method is used to identify likely genomic binding sites for the E. coli transcription factors collected in the DPInteract database. In addition, for CRP (a global regulatory factor), the likely regulatory modality (i.e., repressor or activator) of predicted binding sites is determined.
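The saturation effect mentioned in the abstract can be illustrated with a minimal sketch. It assumes the standard two-state occupancy form p = 1 / (1 + exp((E - mu) / kT)), with the binding energy E taken as a sum of independent per-position contributions and all energies expressed in units of kT. The energy matrix `ENERGY`, the 4-bp site length, and the function names are hypothetical and chosen only for illustration; they are not values or code from the paper.

```python
import math

# Hypothetical per-position energy matrix in units of k_B*T (illustrative only,
# not taken from the paper). Lower energy means stronger binding.
ENERGY = [
    {"A": 0.0, "C": 2.0, "G": 2.5, "T": 1.5},
    {"A": 1.8, "C": 0.0, "G": 2.2, "T": 2.0},
    {"A": 2.1, "C": 1.9, "G": 0.0, "T": 2.4},
    {"A": 1.6, "C": 2.3, "G": 2.0, "T": 0.0},
]


def binding_energy(site, energy=ENERGY):
    """Sum independent per-position energy contributions for a site."""
    if len(site) != len(energy):
        raise ValueError("site length must match the energy matrix length")
    return sum(column[base] for column, base in zip(energy, site))


def binding_probability(site, mu, energy=ENERGY):
    """Saturating occupancy: approaches 1 when E is far below mu."""
    return 1.0 / (1.0 + math.exp(binding_energy(site, energy) - mu))


def boltzmann_weight(site, mu, energy=ENERGY):
    """Unsaturated exponential weight, shown only for comparison.

    It matches binding_probability when E is far above mu, but grows
    without bound (beyond 1) when E drops below mu.
    """
    return math.exp(mu - binding_energy(site, energy))


if __name__ == "__main__":
    mu = 3.0  # assumed chemical potential, in units of k_B*T
    for site in ("ACGT", "CCGT", "CAGT", "TTTA"):
        print(site,
              round(binding_energy(site), 2),
              round(binding_probability(site, mu), 3),
              round(boltzmann_weight(site, mu), 3))
```

In this sketch the chemical potential `mu` acts as the binding threshold: sites with energy well below `mu` are all close to fully occupied, so the saturating form does not keep rewarding ever-stronger matches the way the exponential weight does.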
