

Do We Train on Test Data? Purging CIFAR of Near-Duplicates.

Authors

Barz Björn, Denzler Joachim

Affiliation

Computer Vision Group, Friedrich Schiller University Jena, Ernst-Abbe-Platz 2, 07743 Jena, Germany.

Publication

J Imaging. 2020 Jun 2;6(6):41. doi: 10.3390/jimaging6060041.

DOI: 10.3390/jimaging6060041
PMID: 34460587
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC8321059/
Abstract

The CIFAR-10 and CIFAR-100 datasets are two of the most heavily benchmarked datasets in computer vision and are often used to evaluate novel methods and model architectures in the field of deep learning. However, we find that 3.3% and 10% of the images from the test sets of these datasets have duplicates in the training set. These duplicates are easily recognizable by memorization and may, hence, bias the comparison of image recognition techniques regarding their generalization capability. To eliminate this bias, we provide the "fair CIFAR" (ciFAIR) dataset, where we replaced all duplicates in the test sets with new images sampled from the same domain. The training set remains unchanged, in order not to invalidate pre-trained models. We then re-evaluate the classification performance of various popular state-of-the-art CNN architectures on these new test sets to investigate whether recent research has overfitted to memorizing data instead of learning abstract concepts. We find a significant drop in classification accuracy of between 9% and 14% relative to the original performance on the duplicate-free test set. We make both the ciFAIR dataset and pre-trained models publicly available and furthermore maintain a leaderboard for tracking the state of the art.
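The paper's actual duplicate search relies on learned image representations; purely as a hypothetical illustration of the underlying idea, a nearest-neighbour check in flattened pixel space can be sketched with NumPy (synthetic arrays stand in for the CIFAR images, and the `threshold` value is an arbitrary choice for this toy data, not one taken from the paper):

```python
import numpy as np

def flag_near_duplicates(train, test, threshold):
    """Flag each test image whose nearest training image lies within
    `threshold` squared Euclidean distance in flattened pixel space."""
    tr = train.reshape(len(train), -1).astype(np.float64)
    te = test.reshape(len(test), -1).astype(np.float64)
    # Pairwise squared distances via ||a - b||^2 = ||a||^2 + ||b||^2 - 2 a.b
    d2 = (te**2).sum(1)[:, None] + (tr**2).sum(1)[None, :] - 2.0 * te @ tr.T
    return d2.min(axis=1) <= threshold

rng = np.random.default_rng(0)
train = rng.random((100, 32, 32, 3))
test = rng.random((20, 32, 32, 3))
test[0] = train[42]                                     # plant an exact duplicate
test[1] = train[7] + rng.normal(0, 0.01, (32, 32, 3))   # plant a near-duplicate
flags = flag_near_duplicates(train, test, threshold=5.0)
print(flags)
```

Only the two planted test images are flagged here; unrelated random images sit far above the threshold. A real pipeline would compare feature embeddings rather than raw pixels, since pixel distance misses duplicates that are shifted, re-colored, or re-scaled.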


Figures:
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/f76d/8321059/a3f6cee4b547/jimaging-06-00041-g001.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/f76d/8321059/a363f052ee56/jimaging-06-00041-g002.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/f76d/8321059/a80b29cd8276/jimaging-06-00041-g003.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/f76d/8321059/587951b21531/jimaging-06-00041-g004.jpg

Similar articles

1. Do We Train on Test Data? Purging CIFAR of Near-Duplicates.
   J Imaging. 2020 Jun 2;6(6):41. doi: 10.3390/jimaging6060041.
2. Promoting the Shift From Pixel-Level Correlations to Object Semantics Learning by Rethinking Computer Vision Benchmark Data Sets.
   Neural Comput. 2024 Jul 19;36(8):1626-1642. doi: 10.1162/neco_a_01677.
3. Musculoskeletal Images Classification for Detection of Fractures Using Transfer Learning.
   J Imaging. 2020 Nov 23;6(11):127. doi: 10.3390/jimaging6110127.
4. A novel end-to-end classifier using domain transferred deep convolutional neural networks for biomedical images.
   Comput Methods Programs Biomed. 2017 Mar;140:283-293. doi: 10.1016/j.cmpb.2016.12.019. Epub 2017 Jan 6.
5. Targeted transfer learning to improve performance in small medical physics datasets.
   Med Phys. 2020 Dec;47(12):6246-6256. doi: 10.1002/mp.14507. Epub 2020 Oct 25.
6. Evolution of Deep Convolutional Neural Networks Using Cartesian Genetic Programming.
   Evol Comput. 2020 Spring;28(1):141-163. doi: 10.1162/evco_a_00253. Epub 2019 Mar 22.
7. Plant Disease Classification: A Comparative Evaluation of Convolutional Neural Networks and Deep Learning Optimizers.
   Plants (Basel). 2020 Oct 6;9(10):1319. doi: 10.3390/plants9101319.
8. Discriminative Unsupervised Feature Learning with Exemplar Convolutional Neural Networks.
   IEEE Trans Pattern Anal Mach Intell. 2016 Sep;38(9):1734-47. doi: 10.1109/TPAMI.2015.2496141. Epub 2015 Oct 29.
9. Deep neural networks for human microRNA precursor detection.
   BMC Bioinformatics. 2020 Jan 13;21(1):17. doi: 10.1186/s12859-020-3339-7.
10. Diagnosing acute promyelocytic leukemia by using convolutional neural network.
   Clin Chim Acta. 2021 Jan;512:1-6. doi: 10.1016/j.cca.2020.10.039. Epub 2020 Nov 4.

Cited by

1. GUANinE v1.0: Benchmark Datasets for Genomic AI Sequence-to-Function Models.
   bioRxiv. 2024 Mar 7:2023.10.12.562113. doi: 10.1101/2023.10.12.562113.

References

1. Improving Large-Scale Image Retrieval Through Robust Aggregation of Local Descriptors.
   IEEE Trans Pattern Anal Mach Intell. 2017 Sep;39(9):1783-1796. doi: 10.1109/TPAMI.2016.2613873. Epub 2016 Sep 27.
2. 80 million tiny images: a large data set for nonparametric object and scene recognition.
   IEEE Trans Pattern Anal Mach Intell. 2008 Nov;30(11):1958-70. doi: 10.1109/TPAMI.2008.128.