Gladstone Institute of Data Science and Biotechnology, Gladstone Institutes, San Francisco, CA, United States of America.
Phys Biol. 2020 Sep 28;17(6):066001. doi: 10.1088/1478-3975/aba50f.
Fitting the probability mass functions from analytical solutions of stochastic models of gene expression to the single-cell count distributions of mRNA and protein molecules can yield valuable insights into mechanisms underlying gene expression. Solutions of chemical master equations are available for various kinetic schemes but, even for the basic ON-OFF genetic switch, they take complex forms with generating functions given as hypergeometric functions. Interpretation of gene expression dynamics in terms of bursts is not consistent with the complete range of parameters for these functions. Physical insights into the probability mass functions are essential to ensure proper interpretations but are lacking for models considering genetic switches. To fill this gap, we develop urn models for stochastic gene expression. We sample RNA polymerases or ribosomes from a master urn, which represents the cytosol, and assign them to recipient urns of two or more colors, which represent time intervals in which no switching occurs. Colors of the recipient urns represent sub-systems of the promoter states, and the assignments to urns of a specific color represent gene expression. We use elementary principles of discrete probability theory to solve a range of kinetic models without feedback, including the Peccoud-Ycart model, the Shahrezaei-Swain model, and models with an arbitrary number of promoter states. In the last case, we obtain a novel result for the protein distribution. For activated genes, we show that transcriptional lapses, which are events of gene inactivation for short time intervals separated by long active intervals, quantify the transcriptional dynamics better than bursts. We show that the intuition gained from our urn models may also be useful in understanding existing solutions for models with feedback. 
We contrast our models with urn models for related distributions, discuss a generalization of the Delaporte distribution for single-cell data analysis, and highlight the limitations of our models.
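As a concrete illustration of the kind of probability mass function the abstract refers to, the sketch below evaluates the steady-state mRNA distribution of the ON-OFF (Peccoud-Ycart) switch. It uses the standard Poisson-beta mixture representation and a Kummer transformation of the confluent hypergeometric function, not the urn construction developed in the paper. The function names `hyp1f1_series` and `telegraph_pmf`, the parameter names, and the convention that all rates are expressed in units of the mRNA degradation rate are assumptions made for this example.

```python
import math

def hyp1f1_series(a, b, x, tol=1e-15, max_terms=10_000):
    """Kummer's 1F1(a; b; x) by direct power series.

    Intended for a, b, x > 0, where every term is positive and the
    sum is numerically stable.
    """
    term, total = 1.0, 1.0
    for k in range(max_terms):
        term *= (a + k) / (b + k) * x / (k + 1)
        total += term
        if term < tol * total:
            break
    return total

def telegraph_pmf(n, k_on, k_off, k_syn):
    """Steady-state P(n mRNA) for the two-state ON-OFF switch.

    Assumption: k_on, k_off and k_syn are all scaled by the mRNA
    degradation rate, so they are dimensionless.

    Starting from the Poisson-beta mixture,
        P(n) = k_syn^n / n! * (k_on)_n / (k_on + k_off)_n
               * 1F1(k_on + n; k_on + k_off + n; -k_syn),
    the Kummer transformation 1F1(a; b; -x) = exp(-x) 1F1(b - a; b; x)
    replaces the alternating series by one with positive terms.
    """
    log_prefactor = (
        n * math.log(k_syn) - math.lgamma(n + 1)
        + math.lgamma(k_on + n) - math.lgamma(k_on)
        - math.lgamma(k_on + k_off + n) + math.lgamma(k_on + k_off)
        - k_syn
    )
    return math.exp(log_prefactor) * hyp1f1_series(k_off, k_on + k_off + n, k_syn)

# Example with hypothetical parameter values.
pmf = [telegraph_pmf(n, k_on=0.5, k_off=2.0, k_syn=20.0) for n in range(200)]
mean = sum(n * p for n, p in enumerate(pmf))
print(f"total probability = {sum(pmf):.6f}, mean = {mean:.6f}")
```

For this model the mean should equal k_syn * k_on / (k_on + k_off), which gives a quick sanity check before fitting such a function to single-cell count data.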