Health Economics and Outcome Research, Exact Sciences Corporation, Madison, WI, USA.
Departments of Industrial & Systems Engineering and Population Health Sciences, University of Wisconsin-Madison, Madison, WI, USA.
Med Decis Making. 2023 Aug;43(6):719-736. doi: 10.1177/0272989X231184175. Epub 2023 Jul 11.
Machine learning (ML)-based emulators improve the calibration of decision-analytical models, but their performance in complex microsimulation models is yet to be determined.
We demonstrated the use of an ML-based emulator with the Colorectal Cancer (CRC)-Adenoma Incidence and Mortality (CRC-AIM) model, which includes 23 unknown natural history input parameters to replicate CRC epidemiology in the United States. We first generated 15,000 input combinations and ran the CRC-AIM model to evaluate CRC incidence, adenoma size distribution, and the percentage of small adenomas detected by colonoscopy. We then used this data set to train several ML algorithms, including a deep neural network (DNN), random forest, and several gradient boosting variants (i.e., XGBoost, LightGBM, CatBoost), and compared their performance. We evaluated 10 million potential input combinations using the selected emulator and examined the input combinations that best matched the observed calibration targets. Furthermore, we cross-validated outcomes generated by the CRC-AIM model against those generated by CISNET models. The calibrated CRC-AIM model was externally validated using the United Kingdom Flexible Sigmoidoscopy Screening Trial (UKFSST).
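The emulator workflow described above (sample inputs, run the simulator, fit a surrogate, screen many candidates, keep the best fits) can be sketched in a few lines. This is only an illustrative toy under loud assumptions: `toy_simulator` is a hypothetical 3-parameter, 2-outcome function standing in for CRC-AIM, and a least-squares fit on quadratic features stands in for the DNN. None of the names, sizes, or targets below come from the paper's implementation.

```python
import numpy as np

rng = np.random.default_rng(0)

N_PARAMS = 3            # toy size; the paper calibrates 23 parameters
N_TRAIN = 500           # toy size; the paper used 15,000 simulator runs
N_CANDIDATES = 100_000  # toy size; the paper screened 10 million inputs


def toy_simulator(x):
    """Hypothetical stand-in for an expensive microsimulation run.

    Maps an (n, 3) array of inputs to an (n, 2) array of outcomes.
    """
    y1 = x[:, 0] + 0.5 * x[:, 1] ** 2
    y2 = x[:, 1] * x[:, 2] + 0.2 * x[:, 0]
    return np.column_stack([y1, y2])


def features(x):
    """Quadratic expansion: bias, linear terms, and all pairwise products."""
    n, d = x.shape
    cols = [np.ones(n)] + [x[:, i] for i in range(d)]
    cols += [x[:, i] * x[:, j] for i in range(d) for j in range(i, d)]
    return np.column_stack(cols)


# 1. Sample training inputs and run the (expensive) simulator on them.
x_train = rng.uniform(0.0, 1.0, size=(N_TRAIN, N_PARAMS))
y_train = toy_simulator(x_train)

# 2. Fit the emulator. Assumption: ordinary least squares on quadratic
#    features is used here as a simple surrogate in place of a DNN.
coef, *_ = np.linalg.lstsq(features(x_train), y_train, rcond=None)


def emulate(x):
    """Cheap emulator prediction of the simulator outcomes."""
    return features(x) @ coef


# 3. Screen a large candidate set with the emulator instead of the simulator.
#    The "observed" targets are made up from a known toy parameter vector.
targets = toy_simulator(np.array([[0.3, 0.6, 0.8]]))[0]
candidates = rng.uniform(0.0, 1.0, size=(N_CANDIDATES, N_PARAMS))
error = np.sqrt(((emulate(candidates) - targets) ** 2).sum(axis=1))

# 4. Keep the best-fitting candidates and confirm them with the simulator.
best_inputs = candidates[np.argsort(error)[:7]]
confirmed = toy_simulator(best_inputs)
```

In this sketch the expensive simulator is called only for the training set and for the handful of shortlisted candidates, which is where the computational saving of an emulator comes from. A real application would also need input and output preprocessing and a held-out set to check emulator accuracy before trusting the screening step.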
The DNN with proper preprocessing outperformed the other tested ML algorithms and successfully predicted all 8 outcomes for different input combinations. The trained DNN took 473 s to predict outcomes for 10 million inputs, a task that would have required 190 CPU-years without the DNN. The overall calibration process took 104 CPU-days, which included building the data set and training, selecting, and tuning the hyperparameters of the ML algorithms. While 7 input combinations had acceptable fit to the targets, the combination that best fit all outcomes was selected as the best vector. Almost all of the predictions made by the best vector lay within those from the CISNET models, demonstrating CRC-AIM's cross-model validity. Similarly, CRC-AIM accurately predicted the hazard ratios of CRC incidence and mortality as reported by UKFSST, demonstrating its external validity. Examination of the impact of calibration targets suggested that the selection of the calibration target had a substantial impact on model outcomes in terms of life-year gains with screening.
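The reported timings imply a few figures that are not stated directly, and a short back-of-the-envelope calculation makes them explicit. It assumes a 365.25-day year and treats the 473 s of DNN prediction time as comparable to CPU time, which the abstract does not confirm (the figure could be wall-clock time on different hardware), so the ratios are rough.

```python
# Figures taken from the abstract.
N_INPUTS = 10_000_000      # input combinations screened by the emulator
DNN_SECONDS = 473          # time for the trained DNN to predict all of them
SIM_CPU_YEARS = 190        # estimated cost of running the simulator instead
CALIBRATION_CPU_DAYS = 104 # whole emulator-based calibration process

# Assumption: 1 year = 365.25 days.
DAYS_PER_YEAR = 365.25
SECONDS_PER_DAY = 24 * 60 * 60

sim_cpu_seconds = SIM_CPU_YEARS * DAYS_PER_YEAR * SECONDS_PER_DAY

# Implied average cost of a single simulator run, in CPU-seconds.
seconds_per_run = sim_cpu_seconds / N_INPUTS

# Ratio of simulator time to emulator time for the screening step alone.
prediction_speedup = sim_cpu_seconds / DNN_SECONDS

# Ratio for the whole process, including data generation and training.
overall_reduction = (SIM_CPU_YEARS * DAYS_PER_YEAR) / CALIBRATION_CPU_DAYS

print(f"implied CPU-seconds per simulator run: {seconds_per_run:.0f}")
print(f"screening-step speedup: {prediction_speedup:.2e}")
print(f"overall reduction factor: {overall_reduction:.0f}")
```

Under these assumptions, each simulator run costs roughly 10 CPU-minutes, the screening step is about seven orders of magnitude faster with the emulator, and the end-to-end calibration is several hundred times cheaper than brute-force evaluation of all 10 million inputs.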
Emulators such as a DNN that is meticulously selected and trained can substantially reduce the computational burden of calibrating complex microsimulation models.
Calibrating a microsimulation model, a process to find unobservable parameters so that the model fits observed data, is computationally complex. We used a deep neural network model, a popular machine learning algorithm, to calibrate the Colorectal Cancer Adenoma Incidence and Mortality (CRC-AIM) model. We demonstrated that our approach provides an efficient and accurate method to significantly speed up calibration in microsimulation models. The calibration process successfully provided cross-model validation of CRC-AIM against 3 established CISNET models and external validation against a randomized controlled trial.