

Stacked DeBERT: All attention in incomplete data for text classification.

Affiliations

School of Electronics and Electrical Engineering, South Korea.

Department of Artificial Intelligence, Kyungpook National University, Daegu, 41566, South Korea.

Publication Information

Neural Netw. 2021 Apr;136:87-96. doi: 10.1016/j.neunet.2020.12.018. Epub 2020 Dec 25.

DOI: 10.1016/j.neunet.2020.12.018
PMID: 33453522
Abstract

In this paper, we propose Stacked DeBERT, short for Stacked Denoising Bidirectional Encoder Representations from Transformers. This novel model improves robustness on incomplete data, compared to existing systems, by designing a novel encoding scheme in BERT, a powerful language representation model based solely on attention mechanisms. Incomplete data in natural language processing refers to text with missing or incorrect words, whose presence can hinder the performance of current models, which were not built to withstand such noise but must still perform well under it. This is because current approaches are built for, and trained with, clean and complete data, and thus cannot extract features that adequately represent incomplete data. Our proposed approach obtains intermediate input representations by applying an embedding layer to the input tokens, followed by vanilla transformers. These intermediate features are given as input to novel denoising transformers, which are responsible for obtaining richer input representations. The proposed approach takes advantage of stacks of multilayer perceptrons to reconstruct missing words' embeddings by extracting more abstract and meaningful hidden feature vectors, and of bidirectional transformers for improved embedding representation. We consider two datasets for training and evaluation: the Chatbot Natural Language Understanding Evaluation Corpus and Kaggle's Twitter Sentiment Corpus. Our model shows improved F1-scores and better robustness on the informal/incorrect texts present in tweets and on texts with Speech-to-Text errors, in both sentiment and intent classification tasks.
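To make the pipeline concrete, here is a minimal PyTorch sketch of the architecture as the abstract describes it: an embedding layer, vanilla transformer encoders, a denoising stage built from stacked multilayer perceptrons, a bidirectional transformer over the reconstructed embeddings, and a classification head. The class names, layer counts, and dimensions below are illustrative assumptions, not the authors' released implementation.

```python
# Illustrative sketch of the Stacked DeBERT pipeline from the abstract.
# Layer sizes, depths, and class names are assumptions for demonstration.
import torch
import torch.nn as nn

class DenoisingTransformer(nn.Module):
    """Denoising stage: stacked MLPs reconstruct the embeddings of
    missing/corrupted tokens, then a bidirectional transformer encoder
    refines the reconstructed sequence representation."""
    def __init__(self, hidden=768, bottleneck=128, layers=2, heads=12):
        super().__init__()
        # Stacked MLPs: compress each token vector to a more abstract
        # hidden vector, then decode back to embedding size
        # (denoising-autoencoder style reconstruction).
        self.encoder_mlp = nn.Sequential(nn.Linear(hidden, bottleneck), nn.Tanh())
        self.decoder_mlp = nn.Sequential(nn.Linear(bottleneck, hidden), nn.Tanh())
        enc_layer = nn.TransformerEncoderLayer(
            d_model=hidden, nhead=heads, batch_first=True)
        self.transformer = nn.TransformerEncoder(enc_layer, num_layers=layers)

    def forward(self, x):                           # x: (batch, seq, hidden)
        x = self.decoder_mlp(self.encoder_mlp(x))   # reconstruct embeddings
        return self.transformer(x)                  # bidirectional attention

class StackedDeBERTClassifier(nn.Module):
    """End-to-end sketch: embeddings -> vanilla transformers ->
    denoising transformers -> classification head."""
    def __init__(self, vocab_size=30522, hidden=768, num_classes=2):
        super().__init__()
        self.embedding = nn.Embedding(vocab_size, hidden)
        vanilla = nn.TransformerEncoderLayer(
            d_model=hidden, nhead=12, batch_first=True)
        self.vanilla_transformer = nn.TransformerEncoder(vanilla, num_layers=2)
        self.denoiser = DenoisingTransformer(hidden=hidden)
        self.classifier = nn.Linear(hidden, num_classes)

    def forward(self, token_ids):              # token_ids: (batch, seq)
        x = self.embedding(token_ids)          # intermediate representations
        x = self.vanilla_transformer(x)
        x = self.denoiser(x)                   # richer, denoised features
        return self.classifier(x[:, 0])        # classify from first token
```

A forward pass such as `StackedDeBERTClassifier()(torch.randint(0, 30522, (2, 16)))` returns per-class logits; the denoising stage is the component the abstract credits with producing richer representations of incomplete inputs.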


Similar Articles

1. Stacked DeBERT: All attention in incomplete data for text classification.
Neural Netw. 2021 Apr;136:87-96. doi: 10.1016/j.neunet.2020.12.018. Epub 2020 Dec 25.
2. Limitations of Transformers on Clinical Text Classification.
IEEE J Biomed Health Inform. 2021 Sep;25(9):3596-3607. doi: 10.1109/JBHI.2021.3062322. Epub 2021 Sep 3.
3. GT-Finder: Classify the family of glucose transporters with pre-trained BERT language models.
Comput Biol Med. 2021 Apr;131:104259. doi: 10.1016/j.compbiomed.2021.104259. Epub 2021 Feb 7.
4. Oversampling effect in pretraining for bidirectional encoder representations from transformers (BERT) to localize medical BERT and enhance biomedical BERT.
Artif Intell Med. 2024 Jul;153:102889. doi: 10.1016/j.artmed.2024.102889. Epub 2024 May 5.
5. Text Sentiment Classification Based on BERT Embedding and Sliced Multi-Head Self-Attention Bi-GRU.
Sensors (Basel). 2023 Jan 28;23(3):1481. doi: 10.3390/s23031481.
6. Extracting comprehensive clinical information for breast cancer using deep learning methods.
Int J Med Inform. 2019 Dec;132:103985. doi: 10.1016/j.ijmedinf.2019.103985. Epub 2019 Oct 2.
7. Automatic text classification of actionable radiology reports of tinnitus patients using bidirectional encoder representations from transformer (BERT) and in-domain pre-training (IDPT).
BMC Med Inform Decis Mak. 2022 Jul 30;22(1):200. doi: 10.1186/s12911-022-01946-y.
8. Multi-Label Classification in Patient-Doctor Dialogues With the RoBERTa-WWM-ext + CNN (Robustly Optimized Bidirectional Encoder Representations From Transformers Pretraining Approach With Whole Word Masking Extended Combining a Convolutional Neural Network) Model: Named Entity Study.
JMIR Med Inform. 2022 Apr 21;10(4):e35606. doi: 10.2196/35606.
9. A BERT based dual-channel explainable text emotion recognition system.
Neural Netw. 2022 Jun;150:392-407. doi: 10.1016/j.neunet.2022.03.017. Epub 2022 Mar 18.
10. Interactive Dual Attention Network for Text Sentiment Classification.
Comput Intell Neurosci. 2020 Nov 3;2020:8858717. doi: 10.1155/2020/8858717. eCollection 2020.