Laboratorio de Sistemas Complejos, Depto de Computación, FCEyN, Buenos Aires University, Buenos Aires, Argentina.
PLoS One. 2010 Oct 25;5(10):e13283. doi: 10.1371/journal.pone.0013283.
The vast computational resources that became available during the past decade enabled the development and simulation of increasingly complex mathematical models of cancer growth. These models typically involve many free parameters whose determination is a substantial obstacle to model development. Direct measurement of biochemical parameters in vivo is often difficult and sometimes impracticable, while fitting them under data-poor conditions may result in biologically implausible values.
We discuss different methodological approaches to estimating parameters in complex biological models. We use the high computational power of the Blue Gene technology to perform an extensive study of the parameter space in a model of avascular tumor growth. We explicitly show that the cost function used to fit the model to the data has a very rugged landscape in parameter space. This cost function has many local minima with unrealistic solutions, including the global minimum corresponding to the best fit.
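To illustrate what a rugged cost landscape looks like in practice, the following minimal sketch scans a one-dimensional parameter range exhaustively and counts the local minima it finds. The `cost` function here is a hypothetical toy (a smooth bowl plus an oscillating term), not the tumor growth model or the cost function of the study, and the names `grid_scan` and `local_minima` are assumptions introduced for this example only.

```python
import math

def cost(p):
    # Hypothetical rugged cost: a smooth bowl plus an oscillating term.
    # This is a toy stand-in, not the cost function used in the study.
    return (p - 2.0) ** 2 + 2.0 * math.sin(8.0 * p)

def grid_scan(lo, hi, n):
    # Evaluate the cost on n evenly spaced parameter values in [lo, hi].
    step = (hi - lo) / (n - 1)
    params = [lo + i * step for i in range(n)]
    costs = [cost(p) for p in params]
    return params, costs

def local_minima(params, costs):
    # An interior grid point is a local minimum if it is strictly
    # lower than both of its neighbours.
    return [
        (params[i], costs[i])
        for i in range(1, len(costs) - 1)
        if costs[i] < costs[i - 1] and costs[i] < costs[i + 1]
    ]

params, costs = grid_scan(0.0, 4.0, 4001)
minima = local_minima(params, costs)
global_min = min(minima, key=lambda m: m[1])

print("local minima found:", len(minima))
print("lowest local minimum (param, cost):", global_min)
```

A brute-force scan like this is feasible only in very low dimension. The number of grid points grows exponentially with the number of free parameters, which is why an extensive exploration of a realistic model calls for large-scale computing resources.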
The case studied in this paper is an example in which the model parameters that optimally fit the data are not necessarily the best ones from a biological point of view. To avoid force-fitting a model to a dataset, we propose that the best model parameters should be found by choosing, among suboptimal parameter sets, those that match criteria other than the ones used to fit the model. We also conclude that the model, data and optimization approach form a new complex system, and we point to the need for a theory that addresses this problem more generally.
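The selection rule proposed above can be sketched as a two-stage filter: first keep every candidate whose cost is close to the best fit, then choose among those using an independent plausibility criterion. Everything in this sketch is an assumption made for illustration: the function `select_plausible`, the parameter name `growth_rate`, the tolerance, the plausible range, and the candidate values are all invented and do not come from the study.

```python
def select_plausible(candidates, tolerance, is_plausible):
    # Stage 1: keep candidates whose cost is within `tolerance`
    # of the best cost found (the "suboptimal" but acceptable fits).
    best_cost = min(c["cost"] for c in candidates)
    near_optimal = [c for c in candidates if c["cost"] <= best_cost + tolerance]
    # Stage 2: apply a criterion that was NOT used in the fit.
    plausible = [c for c in near_optimal if is_plausible(c["params"])]
    if not plausible:
        return None
    # Among the plausible candidates, prefer the best-fitting one.
    return min(plausible, key=lambda c: c["cost"])

# Invented example values, for illustration only.
candidates = [
    {"params": {"growth_rate": 9.5}, "cost": 0.10},  # best fit, implausible
    {"params": {"growth_rate": 1.2}, "cost": 0.12},
    {"params": {"growth_rate": 0.8}, "cost": 0.14},
    {"params": {"growth_rate": 1.0}, "cost": 0.90},  # plausible, poor fit
]

def is_plausible(params):
    # Hypothetical plausibility range for the illustrative parameter.
    return 0.5 <= params["growth_rate"] <= 2.0

chosen = select_plausible(candidates, tolerance=0.05, is_plausible=is_plausible)
print(chosen)
```

In this toy case the candidate with the lowest cost is rejected because it fails the independent criterion, and a slightly worse fit is selected instead. The choice of tolerance is itself a modeling decision and would need to be justified for any real dataset.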