Learning to Compose Domain-Specific Transformations for Data Augmentation.

Authors

Ratner Alexander J, Ehrenberg Henry R, Hussain Zeshan, Dunnmon Jared, Ré Christopher

Affiliation

Stanford University.

Publication

Adv Neural Inf Process Syst. 2017 Dec;30:3239-3249.

PMID:29375240
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC5786274/
Abstract

Data augmentation is a ubiquitous technique for increasing the size of labeled training sets by leveraging task-specific data transformations that preserve class labels. While it is often easy for domain experts to specify individual transformations, constructing and tuning the more sophisticated compositions typically needed to achieve state-of-the-art results is a time-consuming manual task in practice. We propose a method for automating this process by learning a generative sequence model over user-specified transformation functions using a generative adversarial approach. Our method can make use of arbitrary, non-deterministic transformation functions, is robust to misspecified user input, and is trained on unlabeled data. The learned transformation model can then be used to perform data augmentation for any end discriminative model. In our experiments, we show the efficacy of our approach on both image and text datasets, achieving improvements of 4.0 accuracy points on CIFAR-10, 1.4 F1 points on the ACE relation extraction task, and 3.4 accuracy points when using domain-specific transformation operations on a medical imaging dataset as compared to standard heuristic augmentation approaches.
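The idea of composing user-specified transformation functions into sequences can be illustrated with a minimal sketch. This is not the authors' implementation: the learned generative sequence model and its adversarial training are replaced here by a fixed categorical distribution (`weights`), and the three transformation functions, the nested-list image type, and all names are hypothetical choices made only for illustration.

```python
import random
from typing import Callable, List, Tuple

Image = List[List[float]]  # toy grayscale image: rows of pixel values in [0, 1]
TF = Callable[[Image, random.Random], Image]  # a (possibly random) transformation


def flip_horizontal(img: Image, rng: random.Random) -> Image:
    """Deterministic TF: mirror each row."""
    return [row[::-1] for row in img]


def shift_right(img: Image, rng: random.Random) -> Image:
    """Non-deterministic TF: shift by 0 or 1 pixel, padding with zeros."""
    k = rng.randint(0, 1)
    if k == 0:
        return [row[:] for row in img]
    return [[0.0] * k + row[:-k] for row in img]


def adjust_brightness(img: Image, rng: random.Random) -> Image:
    """Non-deterministic TF: add a small random offset, clipped to [0, 1]."""
    delta = rng.uniform(-0.1, 0.1)
    return [[min(1.0, max(0.0, v + delta)) for v in row] for row in img]


# Hypothetical user-specified transformation functions.
TFS: List[TF] = [flip_horizontal, shift_right, adjust_brightness]


def sample_sequence(weights: List[float], length: int,
                    rng: random.Random) -> List[int]:
    """Stand-in for the learned sequence model: draw TF indices independently
    from a fixed categorical distribution (an assumption, not learned)."""
    return rng.choices(range(len(weights)), weights=weights, k=length)


def augment(img: Image, tfs: List[TF], weights: List[float], length: int,
            rng: random.Random) -> Tuple[Image, List[int]]:
    """Apply a sampled sequence of TFs to one data point, in order."""
    seq = sample_sequence(weights, length, rng)
    out = img
    for i in seq:
        out = tfs[i](out, rng)
    return out, seq


if __name__ == "__main__":
    rng = random.Random(0)
    image = [[0.0, 0.5, 1.0], [0.2, 0.4, 0.6]]
    augmented, sequence = augment(image, TFS, [0.5, 0.3, 0.2], 3, rng)
    print("sequence of TF indices:", sequence)
    print("augmented image:", augmented)
```

In the approach the abstract describes, the sequence distribution would be learned from unlabeled data rather than fixed by hand; the sketch only shows how a sampled sequence of black-box, possibly non-deterministic transformations is applied to a data point.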
