Vector Institute, Toronto, Canada.
Temerty School of Medicine, University of Toronto, Toronto, Canada.
Nat Commun. 2024 Feb 29;15(1):1887. doi: 10.1038/s41467-024-46142-w.
While it is common to monitor deployed clinical artificial intelligence (AI) models for performance degradation, it is less common to monitor the input data for data drift, that is, systemic changes to input distributions. However, when real-time evaluation is not practical (e.g., because of labeling costs) or when gold labels are automatically generated, we argue that tracking data drift becomes a vital addition for AI deployments. In this work, we perform empirical experiments on real-world medical imaging to evaluate the ability of three data drift detection methods to detect data drift caused (a) naturally (emergence of COVID-19 in X-rays) and (b) synthetically. We find that monitoring performance alone is not a good proxy for detecting data drift and that drift detection depends heavily on sample size and patient features. Our work discusses the need for and utility of data drift detection in various scenarios and highlights gaps in knowledge for the practical application of existing methods.
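The dependence of drift detection on sample size can be illustrated with a small sketch. The abstract does not name the three detection methods evaluated, so the example below swaps in a generic two-sample Kolmogorov-Smirnov test on a single scalar feature; it is not the authors' method. The Gaussian "reference" and "deployment" samples, the shift of 0.5, and the helper names `ks_statistic`, `ks_pvalue`, and `simulate` are all assumptions made for illustration, and the p-value uses a standard asymptotic approximation rather than an exact computation.

```python
import math
import random


def ks_statistic(a, b):
    """Largest gap between the empirical CDFs of samples a and b."""
    a, b = sorted(a), sorted(b)
    n, m = len(a), len(b)
    i = j = 0
    d = 0.0
    while i < n and j < m:
        x = min(a[i], b[j])
        # Advance both pointers past every value <= x (handles ties).
        while i < n and a[i] <= x:
            i += 1
        while j < m and b[j] <= x:
            j += 1
        d = max(d, abs(i / n - j / m))
    return d


def ks_pvalue(d, n, m):
    """Asymptotic two-sided p-value (Kolmogorov series approximation)."""
    en = math.sqrt(n * m / (n + m))
    lam = (en + 0.12 + 0.11 / en) * d
    if lam < 0.3:
        # The series converges poorly here, and the true value is ~1.
        return 1.0
    total = sum(
        (-1) ** (k - 1) * math.exp(-2.0 * k * k * lam * lam)
        for k in range(1, 101)
    )
    return min(1.0, max(0.0, 2.0 * total))


def simulate(n, shift, seed):
    """Hypothetical setup: reference ~ N(0, 1), deployment ~ N(shift, 1)."""
    rng = random.Random(seed)
    reference = [rng.gauss(0.0, 1.0) for _ in range(n)]
    deployment = [rng.gauss(shift, 1.0) for _ in range(n)]
    d = ks_statistic(reference, deployment)
    return d, ks_pvalue(d, n, n)


if __name__ == "__main__":
    # Same underlying shift, different numbers of samples per window.
    for n in (20, 200, 2000):
        d, p = simulate(n, shift=0.5, seed=0)
        print(f"n={n:5d}  D={d:.3f}  p={p:.3g}")
```

With the shift held fixed, the p-value generally shrinks as the window grows, so the same drift that is flagged clearly with thousands of samples may go undetected with a few dozen. In an imaging setting the scalar feature would have to come from somewhere, for example a learned embedding or a model output, which is a design choice this sketch does not address.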