Department of Physics, California Institute of Technology, Pasadena, California, USA.
Division of Biology and Biological Engineering, California Institute of Technology, Pasadena, California, USA.
Annu Rev Biophys. 2019 May 6;48:121-163. doi: 10.1146/annurev-biophys-052118-115525.
It is tempting to believe that we now own the genome. The ability to read and rewrite it at will has ushered in a stunning period in the history of science. Nonetheless, there is an Achilles' heel exposed by all of the genomic data that have accrued: We still do not know how to interpret them. Many genes are subject to sophisticated programs of transcriptional regulation, mediated by DNA sequences that harbor binding sites for transcription factors, which can up- or down-regulate gene expression depending upon environmental conditions. This gives rise to an input-output function describing how the level of expression depends upon the parameters of the regulated gene, for instance, on the number and type of binding sites in its regulatory sequence. In recent years, the ability to make precision measurements of expression, coupled with the ability to make increasingly sophisticated theoretical predictions, has enabled an explicit dialogue between theory and experiment that holds the promise of covering this genomic Achilles' heel. The goal is to reach a predictive understanding of transcriptional regulation that makes it possible to calculate gene expression levels from DNA regulatory sequence. This review focuses on the canonical simple repression motif to ask how well the models that have been used to characterize it actually work. We consider a hierarchy of increasingly sophisticated experiments in which the minimal parameter set learned at one level is applied to make quantitative predictions at the next. We show that these careful quantitative dissections provide a template for a predictive understanding of the many more complex regulatory arrangements found across all domains of life.