AUBER：自动 BERT 正则化。

AUBER: Automated BERT regularization.

机构信息

Columbia University, New York, NY, United States of America.

Seoul National University, Seoul, Republic of Korea.

出版信息

PLoS One. 2021 Jun 28;16(6):e0253241. doi: 10.1371/journal.pone.0253241. eCollection 2021.

DOI:10.1371/journal.pone.0253241

PMID:34181664

原文链接:https://pmc.ncbi.nlm.nih.gov/articles/PMC8238198/

Abstract

How can we effectively regularize BERT? Although BERT proves its effectiveness in various NLP tasks, it often overfits when there are only a small number of training instances. A promising direction to regularize BERT is based on pruning its attention heads with a proxy score for head importance. However, these methods are usually suboptimal since they resort to arbitrarily determined numbers of attention heads to be pruned and do not directly aim for the performance enhancement. In order to overcome such a limitation, we propose AUBER, an automated BERT regularization method, that leverages reinforcement learning to automatically prune the proper attention heads from BERT. We also minimize the model complexity and the action search space by proposing a low-dimensional state representation and dually-greedy approach for training. Experimental results show that AUBER outperforms existing pruning methods by achieving up to 9.58% better performance. In addition, the ablation study demonstrates the effectiveness of design choices for AUBER.

摘要

如何有效地正则化 BERT？虽然 BERT 在各种自然语言处理任务中证明了其有效性，但当训练实例数量很少时，它经常会过拟合。正则化 BERT 的一个有前途的方向是基于用头重要性的代理分数来修剪其注意力头。然而，这些方法通常不是最优的，因为它们依赖于任意确定的要修剪的注意力头的数量，并且不直接针对性能增强。为了克服这种局限性，我们提出了 AUBER，一种自动化的 BERT 正则化方法，它利用强化学习从 BERT 中自动修剪适当的注意力头。我们还通过提出低维状态表示和双重贪婪方法来最小化模型复杂度和动作搜索空间。实验结果表明，AUBER 通过实现高达 9.58%的更好性能，优于现有的修剪方法。此外，消融研究证明了 AUBER 的设计选择的有效性。

Suppr 超能文献

文献检索

文件翻译

深度研究

Suppr 超能文献

文献检索

文件翻译

深度研究

AUBER：自动 BERT 正则化。

AUBER: Automated BERT regularization.

机构信息

出版信息

相似文献

引用本文的文献

AUBER：自动 BERT 正则化。

AUBER: Automated BERT regularization.

机构信息

出版信息

相似文献

引用本文的文献