Bauer Alexander, Nakajima Shinichi, Görnitz Nico, Müller Klaus-Robert
IEEE Trans Neural Netw Learn Syst. 2020 Jul;31(7):2680-2684. doi: 10.1109/TNNLS.2019.2934225. Epub 2019 Sep 5.
Many learning tasks in the field of natural language processing, including sequence tagging, sequence segmentation, and syntactic parsing, have been successfully approached by means of structured prediction methods. An appealing property of the corresponding training algorithms is their ability to integrate the loss function of interest into the optimization process, improving the final results according to the chosen measure of performance. Here, we focus on the task of constituency parsing and show how to optimize the model for the F-score in the max-margin framework of a structural support vector machine (SVM). For reasons of computational efficiency, it is a common approach to binarize the corresponding grammar before training. Unfortunately, this introduces a bias during the training procedure, as the corresponding loss function is evaluated on the binary representation, while the resulting performance is measured on the original unbinarized trees. Here, we address this problem by extending the inference procedure presented by Bauer et al. Specifically, we propose an algorithmic modification that allows evaluating the loss on the unbinarized trees. The new approach properly models the loss function of interest, resulting in better prediction accuracy, and still benefits from the computational efficiency of the binarized representation. The presented idea can be easily transferred to other structured loss functions.
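The bias described in the abstract can be made concrete with a small sketch. This is not the authors' implementation; it only illustrates, under common conventions, why a bracket F-score computed on a binarized tree differs from the F-score on the original tree: binarization introduces intermediate nodes (marked here with a hypothetical "@" prefix) whose spans count as errors unless the tree is collapsed back before scoring.

```python
def brackets(tree):
    """Collect labeled spans (label, start, end) of a tree given as nested
    tuples (label, children), where a leaf is (label, word)."""
    spans = set()

    def walk(node, start):
        label, children = node
        if isinstance(children, str):      # preterminal over a single word
            return start + 1               # advance word position, no span
        end = start
        for child in children:
            end = walk(child, end)
        spans.add((label, start, end))
        return end

    walk(tree, 0)
    return spans


def debinarize(tree):
    """Collapse intermediate nodes introduced by binarization. The '@' label
    prefix is an assumed convention for such nodes, not taken from the paper."""
    label, children = tree
    if isinstance(children, str):
        return (label, children)
    new_children = []
    for child in children:
        child = debinarize(child)
        clabel, cchildren = child
        if clabel.startswith("@") and not isinstance(cchildren, str):
            new_children.extend(cchildren)  # splice the binarization node away
        else:
            new_children.append(child)
    return (label, tuple(new_children))


def f1(pred, gold):
    """Bracket F-score between predicted and gold span sets."""
    p, g = brackets(pred), brackets(gold)
    if not p or not g:
        return 0.0
    prec = len(p & g) / len(p)
    rec = len(p & g) / len(g)
    return 0.0 if prec + rec == 0 else 2 * prec * rec / (prec + rec)
```

For a gold ternary node `("VP", (("V","gave"), ("NP","her"), ("NP","flowers")))` and its binarized prediction containing an extra `@VP` node, `f1` scores below 1.0 on the binarized form but exactly 1.0 after `debinarize`, which is the gap between the loss used during training and the measure reported at test time that the proposed modification removes.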