Digital Engineering Faculty, University of Potsdam, Potsdam, Germany.
Digital Health and Machine Learning, Hasso-Plattner-Institute, Potsdam, Germany.
BMC Med Res Methodol. 2023 Aug 19;23(1):187. doi: 10.1186/s12874-023-02003-6.
Machine learning models promise to support diagnostic predictions, but may not perform well in new settings. Selecting the best model for a new setting without available data is challenging. We aimed to investigate the transportability, in terms of calibration and discrimination, of prediction models for cognitive impairment in simulated external settings with different distributions of demographic and clinical characteristics.
We mapped and quantified relationships between variables associated with cognitive impairment using causal graphs, structural equation models, and data from the ADNI study. These estimates were then used to generate datasets and evaluate prediction models with different sets of predictors. We measured transportability to external settings under guided interventions on age, APOE ε4, and tau protein as the difference in performance between internal and external settings, using calibration metrics and the area under the receiver operating characteristic curve (AUC).
Calibration differences indicated that models predicting with causes of the outcome were more transportable than those predicting with consequences. AUC differences showed inconsistent trends in transportability across the different external settings. Models predicting with consequences tended to show higher AUC in the external settings than in the internal setting, while models predicting with parents (direct causes) of the outcome or with all variables showed similar AUC.
Using a practical prediction task as an example, we demonstrated that predicting with causes of the outcome results in better transportability than anti-causal prediction when calibration differences are considered. We conclude that calibration performance is crucial when assessing model transportability to external settings.
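The contrast between causal and anti-causal predictors can be illustrated with a minimal sketch. It is not the study's pipeline: the three-variable graph (X causes Y, Y causes Z), all coefficients, the sample sizes, and the helper names `simulate`, `fit_logistic`, and `evaluate` are hypothetical, and calibration-in-the-large (mean predicted risk minus observed prevalence) stands in for the calibration metrics used in the paper. The external setting is produced by shifting the distribution of the cause X, loosely analogous to an intervention on age.

```python
import math
import random


def sigmoid(t):
    # Numerically stable logistic function.
    if t >= 0:
        return 1.0 / (1.0 + math.exp(-t))
    e = math.exp(t)
    return e / (1.0 + e)


def simulate(n, mean_x, rng):
    """Toy structural model X -> Y -> Z with made-up coefficients."""
    rows = []
    for _ in range(n):
        x = rng.gauss(mean_x, 1.0)                        # cause of the outcome
        y = 1 if rng.random() < sigmoid(1.5 * x) else 0   # binary outcome
        z = 2.0 * y + rng.gauss(0.0, 1.0)                 # consequence of the outcome
        rows.append((x, y, z))
    return rows


def fit_logistic(feature, labels, iters=25):
    """One-feature logistic regression fitted by Newton's method."""
    w, b = 0.0, 0.0
    for _ in range(iters):
        g_w = g_b = h_ww = h_wb = h_bb = 0.0
        for v, y in zip(feature, labels):
            p = sigmoid(w * v + b)
            r, s = p - y, p * (1.0 - p)
            g_w += r * v
            g_b += r
            h_ww += s * v * v
            h_wb += s * v
            h_bb += s
        det = h_ww * h_bb - h_wb * h_wb
        # Newton step: subtract (inverse Hessian) times gradient.
        w -= (h_bb * g_w - h_wb * g_b) / det
        b -= (h_ww * g_b - h_wb * g_w) / det
    return w, b


def auc(scores, labels):
    # Rank-sum (Mann-Whitney) estimate; ties are ignored because scores are continuous.
    order = sorted(range(len(scores)), key=lambda i: scores[i])
    rank_sum = sum(rank for rank, i in enumerate(order, start=1) if labels[i] == 1)
    n_pos = sum(labels)
    n_neg = len(labels) - n_pos
    return (rank_sum - n_pos * (n_pos + 1) / 2) / (n_pos * n_neg)


def evaluate(model, feature, labels):
    w, b = model
    preds = [sigmoid(w * v + b) for v in feature]
    gap = sum(preds) / len(preds) - sum(labels) / len(labels)
    return {"auc": auc(preds, labels), "calibration_gap": gap}


rng = random.Random(0)
internal = simulate(5000, 0.0, rng)   # development setting
external = simulate(5000, 1.5, rng)   # shifted distribution of the cause X

results = {}
for name, col in (("causal", 0), ("anti_causal", 2)):
    y_int = [row[1] for row in internal]
    y_ext = [row[1] for row in external]
    model = fit_logistic([row[col] for row in internal], y_int)
    results[name] = {
        "internal": evaluate(model, [row[col] for row in internal], y_int),
        "external": evaluate(model, [row[col] for row in external], y_ext),
    }

for name, res in results.items():
    print(name, res)
```

In this toy setup the shift in X changes the outcome prevalence. The model that predicts from X keeps the same conditional relationship and so stays close to calibrated, whereas the model that predicts from Z was fitted under the internal prevalence and tends to drift away from the observed rate externally. The sketch says nothing about the magnitude of such effects in real cohorts.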