Jean Velluet, Antonin Della Noce, Véronique Letort
MICS Laboratory, CentraleSupélec, Paris-Saclay University, Gif-sur-Yvette, France.
Plant Phenomics. 2024 Feb 9;6:0133. doi: 10.34133/plantphenomics.0133. eCollection 2024.
Amid the rise of machine learning models, a substantial portion of plant growth models remain mechanistic, seeking to capture an in-depth understanding of the phenomena governing the system's dynamics. Developing these models typically involves estimating parameters from experimental data. Because the parameters carry a biological interpretation, the estimates must lie close to their respective "true" values, which raises the question of whether the solution of the estimation problem is unique. Structural identifiability analysis addresses this question under the assumption of perfect observations of the system dynamics, whereas practical identifiability accounts for limited and noisy measurements. In the literature, definitions of structural identifiability vary only slightly among authors, whereas the concept and quantification of practical identifiability lack consensus, with several indices coexisting. In this work, we provide a unified framework for studying identifiability that accommodates different definitions, to be instantiated according to each application case. In a second, more applied step, we focus on three widely used methods for quantifying practical identifiability: collinearity indices, profile likelihood, and average relative error. We show the limitations of their local versions and propose a new risk index built on profile likelihood-based confidence intervals. We illustrate the usefulness of these concepts for plant growth modeling using a discrete-time individual plant growth model, LNAS, and a continuous-time plant population epidemics model. Through this work, we aim to underline the significance of identifiability analysis as a complement to any parameter estimation study and to offer guidance to the modeler.
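As an illustration of the first of the three methods named in the abstract, the following is a minimal sketch of a local collinearity index, assuming the common definition as the inverse square root of the smallest eigenvalue of the Gram matrix of the column-normalized sensitivity matrix. The logistic `toy_model`, its parameter values, and the finite-difference helper are hypothetical stand-ins chosen for brevity; they are not the LNAS model or the authors' implementation. The index is evaluated at a single parameter point, so it is a local quantity of the kind whose limitations the abstract mentions.

```python
import numpy as np


def toy_model(theta, t):
    """Hypothetical logistic growth curve (illustrative only, not LNAS)."""
    k, r, t0 = theta
    return k / (1.0 + np.exp(-r * (t - t0)))


def sensitivity_matrix(model, theta, t, rel_step=1e-6):
    """Central finite-difference sensitivities dy/dtheta_j, scaled by theta_j."""
    theta = np.asarray(theta, dtype=float)
    S = np.empty((t.size, theta.size))
    for j in range(theta.size):
        h = rel_step * max(abs(theta[j]), 1.0)
        up, down = theta.copy(), theta.copy()
        up[j] += h
        down[j] -= h
        S[:, j] = theta[j] * (model(up, t) - model(down, t)) / (2.0 * h)
    return S


def collinearity_index(S):
    """Return 1 / sqrt(smallest eigenvalue of S_norm^T S_norm)."""
    # Normalize each column to unit Euclidean norm so only directions matter.
    S_norm = S / np.linalg.norm(S, axis=0)
    # eigvalsh returns eigenvalues of a symmetric matrix in ascending order.
    eigvals = np.linalg.eigvalsh(S_norm.T @ S_norm)
    return 1.0 / np.sqrt(eigvals[0])


# Assumed observation times and nominal parameters, for illustration only.
t = np.linspace(0.0, 20.0, 50)
theta = [10.0, 0.5, 10.0]
S = sensitivity_matrix(toy_model, theta, t)
gamma = collinearity_index(S)
print(f"collinearity index: {gamma:.2f}")
```

With this definition the index equals 1 when the sensitivity columns are orthogonal and grows without bound as they approach linear dependence, that is, as a change in one parameter can be compensated almost exactly by changes in the others.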