Unmasking Clever Hans predictors and assessing what machines really learn.

Affiliations

Department of Video Coding & Analytics, Fraunhofer Heinrich Hertz Institute, Einsteinufer 37, 10587, Berlin, Germany.

Department of Electrical Engineering and Computer Science, Technische Universität Berlin, Marchstr. 23, 10587, Berlin, Germany.

Publication information

Nat Commun. 2019 Mar 11;10(1):1096. doi: 10.1038/s41467-019-08987-4.

Abstract

Current learning machines have successfully solved hard application problems, reaching high accuracy and displaying seemingly intelligent behavior. Here we apply recent techniques for explaining decisions of state-of-the-art learning machines and analyze various tasks from computer vision and arcade games. This showcases a spectrum of problem-solving behaviors ranging from naive and short-sighted, to well-informed and strategic. We observe that standard performance evaluation metrics can be oblivious to distinguishing these diverse problem-solving behaviors. Furthermore, we propose our semi-automated Spectral Relevance Analysis that provides a practically effective way of characterizing and validating the behavior of nonlinear learning machines. This helps to assess whether a learned model indeed delivers reliably for the problem that it was conceived for. Furthermore, our work intends to add a voice of caution to the ongoing excitement about machine intelligence and pledges to evaluate and judge some of these recent successes in a more nuanced manner.


Fig. 1 (image): https://cdn.ncbi.nlm.nih.gov/pmc/blobs/e01d/6411769/780ce214a78e/41467_2019_8987_Fig1_HTML.jpg
