Golubev A
Saint-Petersburg State University, Saint Petersburg, Russia.
Comput Math Methods Med. 2017;2017:7925106. doi: 10.1155/2017/7925106. Epub 2017 Jun 5.
In many cases relevant to biomedicine, objects of interest need a variable time, which follows a certain distribution, to pass from an initial to an intermediate state, from which they exit at random to a final state. In such cases, the distribution of times between exiting the initial state and entering the final state must conform to the convolution of the first distribution with a negative exponential distribution. A common example is the exponentially modified Gaussian (EMG), which is widely used in chromatography for peak analysis and has long been known as the ex-Gaussian in psychophysiology, where it is applied to times from stimulus to response. In molecular and cell biology, EMG provides better fits to the variability of times between consecutive cell divisions and between transcriptional bursts than commonly used simple distributions, such as lognormal, gamma, and Wald, and its parameters are more straightforward to interpret. However, since the Gaussian component of EMG is defined over the whole real line, an EMG fitted to data may extend into the negative domain. This extension may seem negligible when the coefficient of variation of the Gaussian component is small, but it becomes considerable as the coefficient increases. Therefore, although in many cases an EMG may be an acceptable approximation of data, an exponentially modified nonnegative peak function, such as the gamma distribution, can make more sense in physical terms. In the present short review, EMG and the exponentially modified gamma distribution (EMGD) are discussed with regard to their applicability to data on the cell cycle, gene expression, physiological responses to stimuli, and other cases, some of which may be interpreted as decision-making. In practical fitting terms, EMG and EMGD perform equally well, and both outperform other functions; however, when the coefficient of variation of the Gaussian component of EMG is greater than ca. 0.4, EMGD is preferable.
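The negative-domain issue described above can be illustrated with a minimal Python sketch. It is not taken from the paper: the function names (`emg_pdf`, `emg_cdf`, `negative_mass`) are hypothetical, and the sketch assumes a Gaussian component with mean 1 and an exponential component whose mean time equals the Gaussian mean (rate 1). It uses the standard closed forms for the density and the cumulative distribution of a Gaussian convolved with an exponential, and reports how much probability an EMG places below zero as the coefficient of variation of the Gaussian component grows.

```python
import math


def norm_cdf(z):
    """Standard normal CDF, written with erfc for accuracy in the far left tail."""
    return 0.5 * math.erfc(-z / math.sqrt(2.0))


def emg_pdf(x, mu, sigma, lam):
    """Density of N(mu, sigma^2) convolved with Exp(rate=lam)."""
    z = (x - mu) / sigma
    return lam * math.exp(-lam * (x - mu) + 0.5 * (lam * sigma) ** 2) * norm_cdf(z - lam * sigma)


def emg_cdf(x, mu, sigma, lam):
    """Cumulative distribution of the same convolution."""
    z = (x - mu) / sigma
    tail = math.exp(-lam * (x - mu) + 0.5 * (lam * sigma) ** 2) * norm_cdf(z - lam * sigma)
    return norm_cdf(z) - tail


def negative_mass(cv, mu=1.0, lam=1.0):
    """Probability that an EMG variate is below zero.

    cv is the coefficient of variation (sigma / mu) of the Gaussian component.
    mu = 1 and lam = 1 are illustrative assumptions, not values from the paper.
    """
    return emg_cdf(0.0, mu, cv * mu, lam)


if __name__ == "__main__":
    for cv in (0.1, 0.2, 0.4, 0.8):
        print(f"CV = {cv:.1f}: P(T < 0) = {negative_mass(cv):.3e}")
```

Under these assumptions the mass below zero is always smaller than that of the Gaussian component alone, because adding a nonnegative exponential delay can only shift times upward. How quickly it grows with the coefficient of variation also depends on the exponential rate, so the output should be read as an illustration rather than as a derivation of the ca. 0.4 threshold.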