Cevik Mucahit, Angco Sabrina, Heydarigharaei Elham, Jahanshahi Hadi, Prayogo Nicholas
Toronto Metropolitan University, 44 Gerrard St E, Toronto, M5B 1G3 Ontario Canada.
J Healthc Inform Res. 2022 Jul 15;6(3):317-343. doi: 10.1007/s41666-022-00117-y. eCollection 2022 Sep.
Sensitivity analysis is an important part of model development because it helps assess how much confidence can be placed in the outcomes of a study. In many practical problems, sensitivity analysis requires evaluating a large number of parameter combinations, which can demand extensive time and computational resources. However, this burden can be avoided by running the model on a smaller subset of parameter combinations and using those results to predict the outcomes for the remaining combinations. In this study, we investigate machine learning-based approaches for speeding up sensitivity analysis. Furthermore, we apply feature selection methods to identify the relative importance of quantitative model parameters in terms of their ability to predict the outcomes. Finally, we highlight the effectiveness of active learning strategies in improving sensitivity analysis by reducing the total number of quantitative model runs required to construct a high-performance prediction model. Our experiments on two datasets, obtained from the sensitivity analyses of two disease screening modeling studies, indicate that ensemble methods such as Random Forests and XGBoost consistently outperform other machine learning algorithms in predicting the sensitivity analysis outcomes. In addition, we note that active learning can lead to significant speed-ups in sensitivity analysis by selecting more informative parameter combinations (i.e., instances) for training the prediction models.
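The general idea described in the abstract, training a prediction model on a subset of parameter combinations and using active learning to decide which combinations to run next, can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: `toy_simulation` is a hypothetical stand-in for an expensive quantitative model, the function name `active_learning_surrogate` and its parameters are invented for this example, and the query rule (disagreement among the trees of a scikit-learn Random Forest) is one common uncertainty heuristic that the abstract does not confirm was used.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor


def toy_simulation(params):
    """Hypothetical stand-in for one expensive quantitative model run per row."""
    return np.sin(3.0 * params[:, 0]) + params[:, 1] ** 2 + 0.5 * params[:, 2]


def active_learning_surrogate(pool, run_model, n_initial=20, batch_size=10,
                              n_rounds=5, seed=0):
    """Fit a surrogate while running the model on only part of the pool.

    Assumption: uncertainty is approximated by the standard deviation of
    the per-tree predictions of a Random Forest.
    """
    rng = np.random.default_rng(seed)
    outcomes = np.full(len(pool), np.nan)
    labeled = np.zeros(len(pool), dtype=bool)

    # Start from a small random sample of parameter combinations.
    first = rng.choice(len(pool), size=n_initial, replace=False)
    outcomes[first] = run_model(pool[first])
    labeled[first] = True

    for _ in range(n_rounds):
        surrogate = RandomForestRegressor(n_estimators=100, random_state=seed)
        surrogate.fit(pool[labeled], outcomes[labeled])

        candidates = np.flatnonzero(~labeled)
        if candidates.size == 0:
            break

        # Query the combinations on which the individual trees disagree most.
        per_tree = np.stack([tree.predict(pool[candidates])
                             for tree in surrogate.estimators_])
        spread = per_tree.std(axis=0)
        chosen = candidates[np.argsort(spread)[-batch_size:]]

        outcomes[chosen] = run_model(pool[chosen])
        labeled[chosen] = True

    # Final fit on every combination that was actually simulated.
    surrogate = RandomForestRegressor(n_estimators=100, random_state=seed)
    surrogate.fit(pool[labeled], outcomes[labeled])
    return surrogate, labeled


# Example usage on a synthetic pool of 300 three-parameter combinations.
pool = np.random.default_rng(42).uniform(0.0, 1.0, size=(300, 3))
surrogate, labeled = active_learning_surrogate(pool, toy_simulation)
predicted = surrogate.predict(pool[~labeled])
```

In this sketch only the labeled combinations incur a model run; the outcomes for the rest are predicted by the surrogate. The fitted forest also exposes `feature_importances_`, which gives one rough, impurity-based ranking of the parameters, although the feature selection methods actually used in the study are not specified in the abstract.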