Department of Epidemiology and Public Health, Imperial College Faculty of Medicine, London, United Kingdom.
PLoS One. 2010 Jan 27;5(1):e8915. doi: 10.1371/journal.pone.0008915.
BACKGROUND: Models for complex biological systems may involve a large number of parameters. It may well be that some of these parameters cannot be derived from observed data via regression techniques. Such parameters are said to be unidentifiable, the remaining parameters being identifiable. Closely related to this idea is that of redundancy, that a set of parameters can be expressed in terms of some smaller set. Before data is analysed it is critical to determine which model parameters are identifiable or redundant, to avoid ill-defined and poorly convergent regression.
METHODOLOGY/PRINCIPAL FINDINGS: In this paper we outline general considerations on parameter identifiability, and introduce the notions of weak local identifiability and gradient weak local identifiability. These are based on local properties of the likelihood, in particular the rank of the Hessian matrix. We relate these to the notions of parameter identifiability and redundancy previously introduced by Rothenberg (Econometrica 39 (1971) 577-591) and Catchpole and Morgan (Biometrika 84 (1997) 187-196). Within the widely used exponential family, parameter irredundancy, local identifiability, gradient weak local identifiability and weak local identifiability are shown to be largely equivalent. We consider applications to a recently developed class of cancer models of Little and Wright (Math Biosciences 183 (2003) 111-134) and Little et al. (J Theoret Biol 254 (2008) 229-238) that generalize a large number of other recently used quasi-biological cancer models.
CONCLUSIONS/SIGNIFICANCE: We have shown that the previously developed concepts of parameter local identifiability and redundancy are closely related to the apparently weaker properties of weak local identifiability and gradient weak local identifiability; within the widely used exponential family these concepts largely coincide.
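The role of the Hessian rank can be illustrated with a small numerical sketch. The example below is not taken from the paper: it assumes a Poisson regression with a log link (a member of the exponential family) and a linear predictor, for which the Hessian of the negative log-likelihood is D^T diag(mu) D, where D is the design matrix and mu the fitted means. The function names, the design matrices and the tolerance are all hypothetical choices made for illustration. In the first design two parameters enter only through their sum, so the likelihood is flat along one direction and the Hessian loses rank.

```python
import numpy as np


def poisson_hessian(design, theta):
    """Hessian of the negative Poisson log-likelihood with a log link.

    Assumes the linear predictor eta = design @ theta, so the Hessian is
    design.T @ diag(mu) @ design and does not depend on the observed counts.
    """
    mu = np.exp(design @ theta)
    return design.T @ (mu[:, None] * design)


def hessian_rank(hessian, rel_tol=1e-8):
    """Count eigenvalues that are non-negligible relative to the largest one.

    rel_tol is an illustrative threshold, not a recommended value.
    """
    eigvals = np.linalg.eigvalsh(hessian)
    return int(np.sum(eigvals > rel_tol * eigvals.max()))


x = np.linspace(0.0, 1.0, 20)
theta = np.array([0.3, 0.2, 0.1])

# Hypothetical redundant model: eta = (a + b) * x + c, so a and b cannot
# be separated and only their sum is determined by the data.
redundant_design = np.column_stack([x, x, np.ones_like(x)])

# Hypothetical irredundant model: eta = a * x + b * x**2 + c.
full_design = np.column_stack([x, x**2, np.ones_like(x)])

rank_redundant = hessian_rank(poisson_hessian(redundant_design, theta))
rank_full = hessian_rank(poisson_hessian(full_design, theta))

print("parameters:", theta.size)
print("Hessian rank, redundant model:", rank_redundant)
print("Hessian rank, irredundant model:", rank_full)
```

A rank below the number of parameters flags at least one direction in parameter space along which the likelihood is locally flat. In practice the rank decision depends on the tolerance, so a near-singular Hessian may need a closer look before a parameter is declared redundant.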