Robust explainer recommendation for time series classification.

Author information

Nguyen Thu Trang, Le Nguyen Thach, Ifrim Georgiana

Affiliation

School of Computer Science, University College Dublin, Dublin, Ireland.

Publication information

Data Min Knowl Discov. 2024;38(6):3372-3413. doi: 10.1007/s10618-024-01045-8. Epub 2024 Jun 20.

DOI:10.1007/s10618-024-01045-8

PMID:39473587

Original article: https://pmc.ncbi.nlm.nih.gov/articles/PMC11513768/

Abstract

Time series classification is a task that deals with temporal sequences, a data type prevalent in domains such as human activity recognition, sports analytics and general sensing. In this area, interest in explainability has been growing, as explanation is key to understanding the data and the model better. Recently, a great variety of techniques (e.g., LIME, SHAP, CAM) have been proposed and adapted for time series to provide explanation in the form of saliency maps, where the importance of each data point in the time series is quantified with a numerical value. However, the saliency maps can, and often do, disagree, so it is unclear which one to use. This paper provides a novel framework to quantitatively evaluate and rank explanation methods for time series classification. We show how to robustly evaluate the informativeness of a given explanation method (i.e., its relevance for the classification task), and how to compare explanations side by side. We propose AMEE, a Model-Agnostic Explanation Evaluation framework, for recommending saliency-based explanations for time series classification. In this approach, perturbation guided by each explanation is applied to the input time series. Our results show that perturbing discriminative parts of the time series leads to significant changes in classification accuracy, which can be used to evaluate each explanation. To be robust to different types of perturbations and different types of classifiers, we aggregate the accuracy loss across perturbations and classifiers. This novel approach allows us to recommend the best explainer among a set of different explainers, including random and oracle explainers. We provide a quantitative and qualitative analysis for synthetic datasets, a variety of time-series datasets, as well as a real-world case study with known expert ground truth.
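
The evaluation idea described in the abstract (perturb the most salient points, measure the drop in accuracy, aggregate over perturbations and classifiers, then rank explainers) can be illustrated with a minimal Python sketch. This is not the authors' AMEE implementation: the function names `perturb` and `rank_explainers`, the two perturbation modes, the perturbed fractions, the plain mean used for aggregation, and the toy data and classifiers are all assumptions chosen for illustration.

```python
import numpy as np

def perturb(X, saliency, frac, rng, mode):
    """Overwrite the top `frac` most salient time steps of each series (assumed scheme)."""
    X_new = X.copy()
    k = max(1, int(round(frac * X.shape[1])))
    top = np.argsort(-saliency, axis=1)[:, :k]      # top-k salient steps per series
    rows = np.arange(X.shape[0])[:, None]
    if mode == "zero":
        X_new[rows, top] = 0.0
    else:                                           # "noise": draw from the global data statistics
        X_new[rows, top] = rng.normal(X.mean(), X.std(), size=top.shape)
    return X_new

def accuracy(predict, X, y):
    return float(np.mean(predict(X) == y))

def rank_explainers(explainers, classifiers, X, y,
                    fracs=(0.1, 0.2, 0.3), modes=("zero", "noise"), seed=0):
    """Rank explainers by mean accuracy loss over classifiers, modes and fractions."""
    rng = np.random.default_rng(seed)
    scores = {}
    for name, saliency in explainers.items():
        losses = []
        for predict in classifiers:
            base = accuracy(predict, X, y)
            for mode in modes:
                for frac in fracs:
                    X_pert = perturb(X, saliency, frac, rng, mode)
                    losses.append(base - accuracy(predict, X_pert, y))
        scores[name] = float(np.mean(losses))       # simple mean as the aggregate (assumption)
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)

# Toy data: class 1 carries a bump on time steps 10..14, class 0 is noise only.
rng = np.random.default_rng(42)
n, T = 200, 50
y = np.repeat([0, 1], n // 2)
X = 0.1 * rng.normal(size=(n, T))
X[y == 1, 10:15] += 1.0

# Two simple stand-in classifiers, each exposed as a predict function.
def threshold_clf(Z):
    return (Z[:, 10:15].mean(axis=1) > 0.5).astype(int)

c0, c1 = X[y == 0].mean(axis=0), X[y == 1].mean(axis=0)
def centroid_clf(Z):
    d0 = ((Z - c0) ** 2).sum(axis=1)
    d1 = ((Z - c1) ** 2).sum(axis=1)
    return (d1 < d0).astype(int)

# Oracle explainer marks the true discriminative region; random explainer is a baseline.
oracle = np.zeros((n, T))
oracle[:, 10:15] = 1.0
explainers = {"oracle": oracle, "random": rng.random((n, T))}

ranking = rank_explainers(explainers, [threshold_clf, centroid_clf], X, y)
for name, loss in ranking:
    print(f"{name}: mean accuracy loss = {loss:.3f}")
```

On this toy problem the oracle saliency map should cause a larger accuracy loss than the random one, which is the behaviour the abstract relies on to separate informative explanations from uninformative ones. A real evaluation would use trained classifiers and held-out test data rather than the hand-built stand-ins above.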
