Faculty of Mathematics and Natural Sciences, Rheinische Friedrich-Wilhelms-Universität Bonn, Bonn, Germany.
Institute of Computational Biology, Helmholtz Zentrum München, Neuherberg, Germany.
PLoS One. 2023 May 22;18(5):e0285836. doi: 10.1371/journal.pone.0285836. eCollection 2023.
Calibrating model parameters on heterogeneous data can be challenging and inefficient. This holds especially for likelihood-free methods such as approximate Bayesian computation (ABC), which rely on the comparison of relevant features in simulated and observed data and are popular for otherwise intractable problems. To address this problem, methods have been developed to scale-normalize data, and to derive informative low-dimensional summary statistics using inverse regression models of parameters on data. However, while approaches that only correct for scale can be inefficient on partly uninformative data, the use of summary statistics can lead to information loss and relies on the accuracy of the employed methods. In this work, we first show that combining adaptive scale normalization with regression-based summary statistics is advantageous when parameters have heterogeneous scales. Second, we present an approach that employs regression models not to transform data, but to inform sensitivity weights quantifying data informativeness. Third, we discuss problems for regression models under non-identifiability, and present a solution using target augmentation. On various problems, we demonstrate improved accuracy and efficiency of the presented approach, and in particular the robustness and wide applicability of the sensitivity weights. Our findings demonstrate the potential of the adaptive approach. The developed algorithms have been made available in the open-source Python toolbox pyABC.
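The idea of regression-informed sensitivity weights can be illustrated with a minimal sketch. It is not the authors' algorithm and does not use the pyABC API; it relies on NumPy only and makes several assumptions that are not in the abstract: a hypothetical two-coordinate toy model (`simulate`), a uniform prior on [0, 10], a plain linear regression of parameters on data, a weighted L1 distance, and rejection ABC that keeps the closest 5% of simulations. All function names are illustrative.

```python
import numpy as np

rng = np.random.default_rng(0)


def simulate(theta, rng):
    # Hypothetical toy model (an assumption, not from the paper):
    # coordinate 0 is informative about theta, coordinate 1 is
    # large-scale noise that carries no information about theta.
    return np.array([theta + 0.1 * rng.normal(), 100.0 * rng.normal()])


def scale_weights(ys):
    # Scale normalization: inverse median absolute deviation per coordinate.
    mad = np.median(np.abs(ys - np.median(ys, axis=0)), axis=0)
    return 1.0 / mad


def sensitivity_weights(thetas, ys):
    # Inverse regression of (standardized) parameters on (standardized) data.
    # The absolute regression coefficients, summed over parameters, serve as
    # a simple proxy for how informative each data coordinate is.
    ys_std = (ys - ys.mean(axis=0)) / ys.std(axis=0)
    th_std = (thetas - thetas.mean(axis=0)) / thetas.std(axis=0)
    design = np.column_stack([np.ones(len(ys_std)), ys_std])
    coef, *_ = np.linalg.lstsq(design, th_std, rcond=None)
    sens = np.abs(coef[1:]).sum(axis=1)
    return sens / sens.sum()


def accept_closest(thetas, ys, y_obs, weights, fraction=0.05):
    # Rejection ABC: keep the fraction of samples with the smallest
    # weighted L1 distance to the observed data.
    dist = np.abs(ys - y_obs) @ weights
    n_keep = int(fraction * len(thetas))
    keep = np.argsort(dist)[:n_keep]
    return thetas[keep, 0]


n = 2000
thetas = rng.uniform(0.0, 10.0, size=(n, 1))          # prior samples
ys = np.array([simulate(t[0], rng) for t in thetas])  # simulated data
y_obs = np.array([5.0, 30.0])                         # hypothetical observation

w_scale = scale_weights(ys)
w_sens = sensitivity_weights(thetas, ys)

post_scale = accept_closest(thetas, ys, y_obs, w_scale)          # scale only
post_sens = accept_closest(thetas, ys, y_obs, w_scale * w_sens)  # scale and sensitivity

print("sensitivity weights:", w_sens)
print("posterior std, scale only:", post_scale.std())
print("posterior std, with sensitivity weights:", post_sens.std())
```

In this toy setting, scale normalization alone puts the uninformative coordinate on an equal footing with the informative one, so accepted parameters spread widely. Multiplying by the sensitivity weights down-weights the noise coordinate while leaving the data untransformed, which is the contrast with regression-based summary statistics that the abstract draws.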