Department of Biochemistry and Molecular Biology, University of the Basque Country, Bilbao, Spain.
Biophys J. 2010 Oct 20;99(8):2408-13. doi: 10.1016/j.bpj.2010.08.006.
Gene regulation involves a hierarchy of events that extend from specific protein-DNA interactions to the combinatorial assembly of nucleoprotein complexes. The effects of DNA sequence on these processes have typically been studied based either on its quantitative connection with single-domain binding free energies or on empirical rules that combine different DNA motifs to predict gene expression trends on a genomic scale. The middle-point approach that quantitatively bridges these two extremes, however, remains largely unexplored. Here, we provide an integrated approach to accurately predict gene expression from statistical sequence information in combination with detailed biophysical modeling of transcription regulation by multidomain binding on multiple DNA sites. For the regulation of the prototypical lac operon, this approach predicts within 0.3-fold accuracy transcriptional activity over a 10,000-fold range from DNA sequence statistics for different intracellular conditions.
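As a rough illustration of how sequence information can be linked to a biophysical expression model, the sketch below maps a DNA site to a binding free energy through an additive energy matrix and then to a relative transcriptional activity through a single-site equilibrium occupancy. This is not the model used in the paper, which treats multidomain binding on multiple DNA sites; the 4-bp site, the matrix entries, the reference free energy `DG_BEST`, and the function names are all hypothetical values chosen for illustration only.

```python
import math

RT = 0.616  # kcal/mol, thermal energy at about 37 °C (R * 310 K)

# Hypothetical per-position energy penalties (kcal/mol), relative to the
# best base at each position of a toy 4-bp site. Not measured values.
ENERGY_MATRIX = [
    {"A": 0.0, "C": 1.2, "G": 1.5, "T": 0.8},
    {"A": 1.0, "C": 0.0, "G": 1.3, "T": 1.1},
    {"A": 0.9, "C": 1.4, "G": 0.0, "T": 1.0},
    {"A": 1.1, "C": 0.7, "G": 1.2, "T": 0.0},
]

# Assumed binding free energy (kcal/mol, 1 M standard state) of the best site.
DG_BEST = -12.0


def binding_free_energy(site):
    """Additive free energy of a site: best-site value plus per-base penalties."""
    site = site.upper()
    if len(site) != len(ENERGY_MATRIX):
        raise ValueError("site length must match the energy matrix")
    return DG_BEST + sum(row[base] for row, base in zip(ENERGY_MATRIX, site))


def relative_activity(site, repressor_molar):
    """Fraction of time the site is free of repressor (simple repression).

    Uses the two-state statistical weight [R] * exp(-dG / RT); DNA looping
    and multiple operators are deliberately left out of this sketch.
    """
    weight = repressor_molar * math.exp(-binding_free_energy(site) / RT)
    return 1.0 / (1.0 + weight)


if __name__ == "__main__":
    for site in ("ACGT", "CCGT", "CAGT"):
        print(site, binding_free_energy(site), relative_activity(site, 1e-8))
```

In this simplified picture, each mismatch from the best site weakens binding and raises the predicted activity; a model closer to the one described above would replace the two-state weight with a sum over the bound and looped states of the regulatory complex.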