CAUSALab, Harvard T.H. Chan School of Public Health, Boston, MA, USA.
Department of Epidemiology, Harvard T.H. Chan School of Public Health, Boston, MA, USA.
Eur J Epidemiol. 2022 Jun;37(6):563-568. doi: 10.1007/s10654-022-00892-3. Epub 2022 Jul 6.
With the rising use of machine learning in healthcare, practitioners are increasingly confronted with the limitations of prediction models that are trained in one setting but meant to be deployed in several others. One recently identified limitation is so-called shortcut learning, whereby a model learns to rely on features whose association with the prediction target does not hold across settings. Famously, the watermark on chest x-rays has been shown to be one such shortcut feature. In this viewpoint, we attempt to give a structural characterization of shortcut features in terms of causal directed acyclic graphs (DAGs). This is the first attempt at defining shortcut features in terms of their causal relationship with a model's prediction target.