Jeong Jenny E, Qiu Peng
Electrical and Computer Engineering, Georgia Institute of Technology, Atlanta, 30332, GA, USA.
Biomedical Engineering, Georgia Institute of Technology and Emory University, Atlanta, 30332, GA, USA.
BMC Syst Biol. 2018 Nov 22;12(Suppl 6):103. doi: 10.1186/s12918-018-0622-6.
Ordinary differential equations (ODEs) are often used to understand biological processes. Because ODE-based models usually contain many unknown parameters, parameter estimation is an important step toward a deeper understanding of the process. Parameter estimation is often formulated as a least squares optimization problem in which all experimental data points are treated as equally important. However, this equal-weight formulation ignores the possibility that data points differ in relative importance, and it may lead to misleading parameter estimates. We therefore propose introducing weights that account for the relative importance of different data points when formulating the least squares optimization problem. Each weight is defined by the uncertainty of one data point given the other data points. If a data point can be accurately inferred from the other data, its uncertainty is low and so is its importance. Conversely, if inferring a data point from the other data is almost impossible, its uncertainty is high and it carries more information for estimating parameters.
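The weighted formulation can be illustrated with a minimal sketch. The abstract does not give the weight computation itself, so the code below swaps in a simple proxy: the error made when a data point is predicted from its two neighbours by linear interpolation. The function names `loo_interpolation_weights` and `weighted_sse`, the toy sigmoidal data, and the normalisation to unit sum are all assumptions made for illustration, not the authors' method.

```python
import math

def loo_interpolation_weights(t, y, eps=1e-12):
    """Hypothetical proxy for 'uncertainty of one point given the others':
    the leave-one-out linear-interpolation error, normalised to sum to 1."""
    n = len(t)
    err = [0.0] * n
    for i in range(1, n - 1):
        # Predict y[i] from its two neighbours by linear interpolation.
        frac = (t[i] - t[i - 1]) / (t[i + 1] - t[i - 1])
        pred = y[i - 1] + frac * (y[i + 1] - y[i - 1])
        err[i] = abs(y[i] - pred)
    # End points have only one neighbour; reuse the adjacent interior error.
    err[0], err[-1] = err[1], err[-2]
    total = sum(err) + eps * n
    return [(e + eps) / total for e in err]

def weighted_sse(y_obs, y_model, w):
    """Weighted least squares objective: sum_i w_i * (y_obs_i - y_model_i)^2."""
    return sum(wi * (yo - ym) ** 2 for wi, yo, ym in zip(w, y_obs, y_model))

# Toy data (assumed): a sigmoidal switch sampled at evenly spaced time points.
t = [float(k) for k in range(11)]
y = [1.0 / (1.0 + math.exp(-2.0 * (tk - 5.0))) for tk in t]
w = loo_interpolation_weights(t, y)
```

With this proxy, points on the bends of the switch receive larger weights than points on the flat plateaus. The proxy is deliberately crude: it assigns almost no weight at the inflection point, where the curve is locally linear, so it should be read only as a picture of how per-point weights enter the objective.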
G1/S transition models with 6 and 12 parameters and a MAPK module with 14 parameters were used to test the weighted formulation. In each case, evenly spaced experimental data points were used. The weights calculated in these models showed similar patterns: high weights for data points in dynamic regions and low weights for data points in flat regions. We developed a sampling algorithm to evaluate the weighted formulation and demonstrated that the weighted formulation reduced the redundancy in the data. For the G1/S transition model with 12 parameters, we also examined unevenly spaced experimental data points, strategically sampled to place more measurement points where the weights were relatively high and fewer where they were relatively low. This analysis showed that the proposed weights can be used to design measurement time points.
Assigning each data point a weight according to its importance relative to the other data points is an effective way to improve the robustness of parameter estimation by reducing redundancy in the experimental data.
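One way to picture weight-guided design of measurement time points is to treat the weights as a density over time and place new points at equal quantiles of its cumulative sum. This is a hedged sketch only: the abstract does not say how the uneven spacing was chosen, and the function `design_time_points`, the example grid, and the example weights are hypothetical.

```python
def design_time_points(t, w, n_new):
    """Place n_new >= 2 time points at equal quantiles of the cumulative
    weight. The cumulative weight uses the trapezoid rule on the original
    grid and is inverted by linear interpolation within each interval,
    which is an approximation chosen for brevity."""
    cum = [0.0]
    for k in range(1, len(t)):
        cum.append(cum[-1] + 0.5 * (w[k] + w[k - 1]) * (t[k] - t[k - 1]))
    total = cum[-1]
    points = []
    k = 1
    for j in range(n_new):
        target = total * j / (n_new - 1)
        # Advance to the grid interval that contains the target quantile.
        while k < len(t) - 1 and cum[k] < target:
            k += 1
        seg = cum[k] - cum[k - 1]
        frac = 0.0 if seg == 0 else min(1.0, (target - cum[k - 1]) / seg)
        points.append(t[k - 1] + frac * (t[k] - t[k - 1]))
    return points

# Assumed example: weights are five times higher in the middle of the grid.
t_grid = [float(k) for k in range(11)]
w_grid = [1, 1, 1, 1, 5, 5, 5, 1, 1, 1, 1]
pts = design_time_points(t_grid, w_grid, 12)
```

In this example most of the twelve points fall between t = 3 and t = 7, where the assumed weights are high, while the low-weight ends of the interval are sampled sparsely.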