Faculty of Computer and Information Science, University of Ljubljana, Večna pot 113, Ljubljana, 1000, Slovenia.
Faculty of Mathematics and Physics, University of Ljubljana, Jadranska ulica 19, Ljubljana, 1000, Slovenia.
BMC Bioinformatics. 2018 Sep 21;19(1):333. doi: 10.1186/s12859-018-2366-0.
Data-driven methods that automatically learn relations between attributes from given data are a popular tool for building mathematical models in computational biology. Since measurements are prone to errors, approaches that deal with uncertain data are especially suitable for this task. Fuzzy models are one such approach, but they contain a large number of parameters and are thus susceptible to over-fitting. Validation methods that help detect over-fitting are therefore needed to eliminate inaccurate models.
We propose a method to enlarge the validation datasets on which a fuzzy dynamic model of a cellular network can be tested. We apply our method to two data-driven dynamic models of the MAPK signalling pathway and two models of the mammalian circadian clock. We show that random initial state perturbations can drastically increase the mean error of predictions of an inaccurate computational model, while keeping errors of predictions of accurate models small.
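The idea of validating against randomly perturbed initial states can be illustrated with a minimal sketch. It is not the authors' implementation and involves no fuzzy model: the scalar decay system, both candidate models, the perturbation range and all names (`decay_step`, `accurate_model`, `overfitted_model`, `perturbed_error`) are assumptions made purely for illustration. The "over-fitted" model simply memorises the trajectory from the nominal initial state, so it fits that one trajectory perfectly but cannot respond to a changed initial state.

```python
import random

NOMINAL_X0 = 1.0   # hypothetical nominal initial state
N_STEPS = 20       # number of simulated time steps


def decay_step(x, rate=0.3):
    """One step of a toy reference system: exponential decay toward zero."""
    return x * (1.0 - rate)


def simulate(step, x0, n_steps=N_STEPS):
    """Iterate `step` from the initial state x0 and return the trajectory."""
    trajectory = [x0]
    for _ in range(n_steps):
        trajectory.append(step(trajectory[-1]))
    return trajectory


def mean_abs_error(reference, prediction):
    """Mean absolute difference between two equally long trajectories."""
    return sum(abs(r - p) for r, p in zip(reference, prediction)) / len(reference)


def accurate_model(x0):
    """Captures the dynamics, with a slightly misestimated decay rate."""
    return simulate(lambda x: decay_step(x, rate=0.29), x0)


# Trajectory memorised from the nominal initial state only.
_MEMORISED = simulate(decay_step, NOMINAL_X0)


def overfitted_model(x0):
    """Replays the memorised trajectory and ignores the initial state."""
    return list(_MEMORISED)


def nominal_error(model):
    """Error on the single trajectory started from the nominal state."""
    return mean_abs_error(simulate(decay_step, NOMINAL_X0), model(NOMINAL_X0))


def perturbed_error(model, n_perturbations=200, scale=0.5, seed=0):
    """Mean error over trajectories started from randomly perturbed states."""
    rng = random.Random(seed)
    errors = []
    for _ in range(n_perturbations):
        # Perturb the nominal state by up to +/- `scale` (relative).
        x0 = NOMINAL_X0 * (1.0 + rng.uniform(-scale, scale))
        errors.append(mean_abs_error(simulate(decay_step, x0), model(x0)))
    return sum(errors) / len(errors)


if __name__ == "__main__":
    for name, model in [("accurate", accurate_model),
                        ("over-fitted", overfitted_model)]:
        print(name, nominal_error(model), perturbed_error(model))
```

In this toy setting the memorising model has zero error on the nominal trajectory and so looks better than the accurate model there. Only the perturbed initial states reveal that it has not captured the dynamics.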
With the improvement of validation methods, fuzzy models are becoming more accurate and are thus likely to gain new applications. This field of research is promising not only because fuzzy models can cope with uncertainty, but also because their run time is short compared to the conventional modelling methods currently used in systems biology.