School of Computer Science and Engineering, University of Electronic Science and Technology of China, Chengdu, Sichuan 611731, China.
Neural Netw. 2021 Jul;139:358-370. doi: 10.1016/j.neunet.2021.03.030. Epub 2021 Apr 1.
As a major method for relation extraction, distantly supervised relation extraction (DSRE) suffers from the noisy label problem and the class imbalance problem (both problems are also common in many other NLP tasks, e.g., text classification). However, there appears to be no existing research in DSRE or other NLP tasks that solves both problems simultaneously, which is a significant gap in the related literature. In this paper, we propose a loss function that is robust to noisy labels and effective on class-imbalanced datasets. More specifically, we first quantify the negative impacts of the noisy label and class imbalance problems, and then construct a loss function that minimizes these negative impacts through a linear programming method. To the best of our knowledge, this is the first attempt to address the noisy label problem and the class imbalance problem simultaneously. We evaluated the constructed loss function on a distantly labeled dataset, our artificially noised dataset, the human-annotated DocRED dataset, and an artificially noised version of the CoNLL 2003 dataset. Experimental results indicate that a DNN model adopting the constructed loss function outperforms models that adopt state-of-the-art noisy-label-robust or negative-sample-robust loss functions.
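The abstract does not spell out the constructed loss, so for orientation only, here is a minimal PyTorch-style sketch of a loss that addresses both issues at once: a noise-robust term (generalized cross-entropy) combined with per-class reweighting for imbalance. The function name `weighted_gce_loss` and the parameters `class_weights` and `q` are illustrative assumptions, not the paper's linear-programming-based construction.

```python
import torch
import torch.nn.functional as F

def weighted_gce_loss(logits, targets, class_weights, q=0.7):
    """Class-weighted generalized cross-entropy: a stand-in sketch, not the paper's loss.

    The GCE term interpolates between cross-entropy (q -> 0) and MAE (q = 1),
    which gives robustness to noisy labels; per-class weights counteract
    class imbalance.
    """
    probs = F.softmax(logits, dim=-1)                           # (N, C) class probabilities
    p_true = probs.gather(1, targets.unsqueeze(1)).squeeze(1)   # probability of the labeled class
    gce = (1.0 - p_true.clamp_min(1e-8).pow(q)) / q             # noise-robust per-example loss
    w = class_weights[targets]                                  # imbalance reweighting per example
    return (w * gce).sum() / w.sum()
```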