Optimizing for Measure of Performance in Max-Margin Parsing.

Author information

Bauer Alexander, Nakajima Shinichi, Gornitz Nico, Muller Klaus-Robert

Publication information

IEEE Trans Neural Netw Learn Syst. 2020 Jul;31(7):2680-2684. doi: 10.1109/TNNLS.2019.2934225. Epub 2019 Sep 5.

Abstract

Many learning tasks in natural language processing, including sequence tagging, sequence segmentation, and syntactic parsing, have been successfully approached by means of structured prediction methods. An appealing property of the corresponding training algorithms is their ability to integrate the loss function of interest into the optimization process, improving the final results according to the chosen measure of performance. Here, we focus on the task of constituency parsing and show how to optimize the model for the F1-score in the max-margin framework of a structural support vector machine (SVM). For reasons of computational efficiency, it is common to binarize the corresponding grammar before training. Unfortunately, this introduces a bias during training, as the loss function is evaluated on the binary representation while the resulting performance is measured on the original unbinarized trees. We address this problem by extending the inference procedure presented by Bauer et al. Specifically, we propose an algorithmic modification that allows evaluating the loss on the unbinarized trees. The new approach properly models the loss function of interest, resulting in better prediction accuracy, and still benefits from the computational efficiency of the binarized representation. The presented idea can be easily transferred to other structured loss functions.
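The mismatch described in the abstract can be seen on a toy example. The sketch below is not the authors' inference procedure; it only illustrates why a bracket F1-score computed on binarized trees can differ from the score on the original trees. The tree encoding (nested tuples), the right-branching binarization with artificial `|bar` labels, and the function names `spans`, `binarize`, and `f1` are all assumptions made for illustration.

```python
from collections import Counter

def spans(tree, start=0):
    """Return (end position, Counter of labeled spans) for a nested-tuple tree."""
    if isinstance(tree, str):              # a leaf is a single token
        return start + 1, Counter()
    label, *children = tree
    found, pos = Counter(), start
    for child in children:
        pos, child_spans = spans(child, pos)
        found += child_spans
    found[(label, start, pos)] += 1        # span covered by this node
    return pos, found

def binarize(tree):
    """Right-branching binarization using artificial 'X|bar' nodes (an assumed scheme)."""
    if isinstance(tree, str):
        return tree
    label, *children = tree
    children = [binarize(c) for c in children]
    while len(children) > 2:
        # fold the two rightmost children under an artificial node
        children = children[:-2] + [(label + "|bar", children[-2], children[-1])]
    return (label, *children)

def f1(gold, pred):
    """Labeled bracket F1 between two trees over the same sentence."""
    g, p = spans(gold)[1], spans(pred)[1]
    matched = sum((g & p).values())        # multiset intersection of spans
    if matched == 0:
        return 0.0
    precision = matched / sum(p.values())
    recall = matched / sum(g.values())
    return 2 * precision * recall / (precision + recall)

# Hypothetical gold and predicted parses of the same five-token sentence.
gold = ("S", ("NP", "the", "old", "red", "car"), ("VP", "stopped"))
pred = ("S", ("NP", "the", "old"), ("NP", "red", "car"), ("VP", "stopped"))

print(round(f1(gold, pred), 3))                      # 0.571 on the original trees
print(round(f1(binarize(gold), binarize(pred)), 3))  # 0.4 on the binarized trees
```

In this example the artificial nodes introduced by binarization change both the number of brackets and which of them match, so a loss of the form 1 - F1 evaluated on the binary representation is a different quantity from the one used at evaluation time.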
