Learning to Compose Domain-Specific Transformations for Data Augmentation.

Author information

Ratner Alexander J, Ehrenberg Henry R, Hussain Zeshan, Dunnmon Jared, Ré Christopher

Affiliation

Stanford University.

Publication information

Adv Neural Inf Process Syst. 2017 Dec;30:3239-3249.

Abstract

Data augmentation is a ubiquitous technique for increasing the size of labeled training sets by leveraging task-specific data transformations that preserve class labels. While it is often easy for domain experts to specify individual transformations, constructing and tuning the more sophisticated compositions typically needed to achieve state-of-the-art results is a time-consuming manual task in practice. We propose a method for automating this process by learning a generative sequence model over user-specified transformation functions using a generative adversarial approach. Our method can make use of arbitrary, non-deterministic transformation functions, is robust to misspecified user input, and is trained on unlabeled data. The learned transformation model can then be used to perform data augmentation for any end discriminative model. In our experiments, we show the efficacy of our approach on both image and text datasets, achieving improvements of 4.0 accuracy points on CIFAR-10, 1.4 F1 points on the ACE relation extraction task, and 3.4 accuracy points when using domain-specific transformation operations on a medical imaging dataset as compared to standard heuristic augmentation approaches.
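The composition idea described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the paper learns a generative sequence model over transformation functions adversarially, whereas the sampler below draws transformation functions uniformly at random as a placeholder. The names `shift`, `scale`, `flip`, `sample_sequence`, `apply_sequence`, and `augment` are hypothetical, and the toy data points are plain lists of floats.

```python
import random
from typing import Callable, List, Sequence, Tuple

# A transformation function (TF) maps a data point to a transformed data point.
# It may be non-deterministic, so it also receives a random number generator.
Point = List[float]
TF = Callable[[Point, random.Random], Point]


def shift(x: Point, rng: random.Random) -> Point:
    """Hypothetical TF: rotate the values by one position."""
    return x[-1:] + x[:-1]


def scale(x: Point, rng: random.Random) -> Point:
    """Hypothetical non-deterministic TF: rescale by a random factor."""
    factor = rng.uniform(0.9, 1.1)
    return [v * factor for v in x]


def flip(x: Point, rng: random.Random) -> Point:
    """Hypothetical TF: reverse the order of the values."""
    return x[::-1]


def sample_sequence(tfs: Sequence[TF], length: int, rng: random.Random) -> List[TF]:
    # Placeholder for the learned sequence model: uniform sampling (an assumption).
    return [rng.choice(tfs) for _ in range(length)]


def apply_sequence(x: Point, seq: Sequence[TF], rng: random.Random) -> Point:
    # Compose the sampled TFs by applying them one after another.
    for tf in seq:
        x = tf(x, rng)
    return x


def augment(
    dataset: Sequence[Tuple[Point, int]],
    tfs: Sequence[TF],
    length: int = 3,
    copies: int = 2,
    seed: int = 0,
) -> List[Tuple[Point, int]]:
    """Return the original points plus `copies` transformed versions of each.

    Labels are copied unchanged, reflecting the assumption that the
    transformations preserve class labels.
    """
    rng = random.Random(seed)
    out: List[Tuple[Point, int]] = []
    for x, y in dataset:
        out.append((x, y))
        for _ in range(copies):
            seq = sample_sequence(tfs, length, rng)
            out.append((apply_sequence(x, seq, rng), y))
    return out
```

Calling `augment(data, [shift, scale, flip])` on a list of `(point, label)` pairs returns the original pairs plus two transformed copies of each. In the paper's setting, the uniform sampler would be replaced by the learned model, which is trained on unlabeled data.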

Similar articles

Active Appearance Model Induced Generative Adversarial Network for Controlled Data Augmentation.
Med Image Comput Comput Assist Interv. 2019 Oct;11764:201-208. doi: 10.1007/978-3-030-32239-7_23. Epub 2019 Oct 10.

Cited by

Semantic-aware Video Representation for Few-shot Action Recognition.
IEEE Winter Conf Appl Comput Vis. 2024 Jan;2024:6444-6454. doi: 10.1109/wacv57701.2024.00633. Epub 2024 Apr 9.

Brain-inspired semantic data augmentation for multi-style images.
Front Neurorobot. 2024 Mar 26;18:1382406. doi: 10.3389/fnbot.2024.1382406. eCollection 2024.

Regularization for Unsupervised Learning of Optical Flow.
Sensors (Basel). 2023 Apr 18;23(8):4080. doi: 10.3390/s23084080.
