Luo Songhao, Zhang Zhenquan, Wang Zihao, Yang Xiyan, Chen Xiaoxuan, Zhou Tianshou, Zhang Jiajun
Guangdong Province Key Laboratory of Computational Science, Sun Yat-sen University, Guangzhou, Guangdong Province 510275, People's Republic of China.
School of Mathematics, Sun Yat-sen University, Guangzhou, Guangdong Province 510275, People's Republic of China.
R Soc Open Sci. 2023 Apr 5;10(4):221057. doi: 10.1098/rsos.221057. eCollection 2023 Apr.
Gene expression is inherently stochastic because transcription occurs in bursts. Single-cell snapshot data can be used to infer transcriptional burst kinetics rigorously, with mathematical models serving as blueprints. The classical telegraph model (CTM) has been widely used to explain transcriptional bursting under Markovian assumptions. However, growing evidence suggests that gene-state dwell times are generally non-exponential, because gene-state switching is a multi-step process in organisms. Interpretable non-Markovian mathematical models and efficient statistical inference methods are therefore urgently needed for investigating transcriptional burst kinetics. We develop an interpretable and tractable model, the generalized telegraph model (GTM), that characterizes transcriptional bursting and allows arbitrary dwell-time distributions, rather than only exponential ones, to be incorporated into the ON and OFF switching processes. Based on the GTM, we propose a method for inferring transcriptional burst kinetics within an approximate Bayesian computation framework. On synthetic data, the method estimates burst frequency and burst size efficiently and scalably. Applied to genome-wide data from mouse embryonic fibroblasts, the GTM yields lower burst frequency estimates and larger burst size estimates than the CTM. In conclusion, the GTM and the corresponding inference method are effective tools for inferring dynamic transcriptional bursting from static single-cell snapshot data.
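To make the idea of non-exponential dwell times concrete, here is a minimal Python sketch of a telegraph-type gene whose ON and OFF dwell times are gamma distributed. It is an illustration under stated assumptions, not the authors' implementation. The function name `simulate_snapshot`, the choice of the gamma family, and all parameter names and values are hypothetical. A shape parameter of 1 gives exponential dwell times, which corresponds to the Markovian CTM case. A forward simulator of this kind is what a likelihood-free scheme such as approximate Bayesian computation would compare against observed snapshot counts.

```python
import random


def simulate_snapshot(rng, k_off, mean_off, k_on, mean_on, syn, deg, t_end):
    """Return the mRNA count of one simulated cell at time t_end.

    Assumptions (illustrative, not from the paper):
      - OFF and ON dwell times are gamma distributed with shapes k_off, k_on
        and means mean_off, mean_on (shape 1 is the exponential case).
      - mRNA is produced at constant rate `syn` while the gene is ON.
      - each mRNA molecule degrades at rate `deg`.
    """
    t, m, on = 0.0, 0, False  # start in the OFF state with no mRNA
    while t < t_end:
        shape, mean = (k_on, mean_on) if on else (k_off, mean_off)
        # random.gammavariate takes (shape, scale); mean = shape * scale
        dwell = rng.gammavariate(shape, mean / shape)
        t_switch = min(t + dwell, t_end)
        # Birth-death events are memoryless, so they can be simulated
        # Gillespie-style until the gene state changes.
        while True:
            rate = (syn if on else 0.0) + deg * m
            if rate == 0.0:
                break  # OFF with no mRNA: nothing can happen
            t_next = t + rng.expovariate(rate)
            if t_next >= t_switch:
                break  # the state switch comes first
            t = t_next
            if rng.random() * rate < deg * m:
                m -= 1  # degradation
            else:
                m += 1  # synthesis (only possible while ON)
        t = t_switch
        on = not on
    return m


if __name__ == "__main__":
    rng = random.Random(1)
    # Hypothetical parameters: shape 2 makes both dwell times non-exponential.
    cells = [simulate_snapshot(rng, 2.0, 2.0, 2.0, 0.5, 20.0, 1.0, 50.0)
             for _ in range(200)]
    print("mean mRNA per cell:", sum(cells) / len(cells))
```

Under these assumptions the long-run fraction of time spent ON is mean_on / (mean_on + mean_off), so the stationary mean mRNA count should be close to syn * mean_on / ((mean_on + mean_off) * deg) whatever the dwell-time shapes. The shapes instead affect the spread of the snapshot distribution.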