Beyond the Transformer: A Novel Polynomial Inherent Attention (PIA) Model and Its Great Impact on Neural Machine Translation.

Affiliation

Department of Computer Science, Prince Sultan University, Riyadh, Saudi Arabia.

Publication information

Comput Intell Neurosci. 2022 Sep 21;2022:1912750. doi: 10.1155/2022/1912750. eCollection 2022.

Abstract

This paper describes a novel polynomial inherent attention (PIA) model that outperforms all state-of-the-art transformer models on neural machine translation (NMT) by a wide margin. PIA is based on the simple idea that natural language sentences can be transformed into a special type of binary attention context vector that accurately captures the semantic context and the relative dependencies between words in a sentence. The transformation is performed using a simple power-of-two polynomial transformation that maintains strictly consistent positioning of words in the resulting vectors. The paper shows how this transformation reduces the neural machine translation process to a simple neural polynomial regression model that provides excellent solutions to the alignment and positioning problems haunting transformer models. The test BLEU scores obtained on the WMT-2014 data set are 75.07 BLEU for the EN-FR data set and 66.35 BLEU for the EN-DE data set, well above the scores achieved by state-of-the-art transformer models on the same data sets. The improvements are, respectively, 65.7% and 87.42%.
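The abstract quotes the improvements without naming the baseline scores they are measured against. The sketch below backs out the implied baselines, assuming each percentage is a relative gain over the best transformer score for that language pair (improvement = (PIA - baseline) / baseline). That reading is an assumption, not something the abstract states, and the helper name `implied_baseline` is hypothetical.

```python
def implied_baseline(score: float, improvement_pct: float) -> float:
    """Back out the baseline score, ASSUMING the quoted improvement
    is relative to that baseline: score = baseline * (1 + pct / 100)."""
    return score / (1.0 + improvement_pct / 100.0)


# Figures quoted in the abstract: (test BLEU on WMT-2014, improvement in %).
reported = {
    "EN-FR": (75.07, 65.7),
    "EN-DE": (66.35, 87.42),
}

for pair, (bleu, gain) in reported.items():
    base = implied_baseline(bleu, gain)
    print(f"{pair}: PIA {bleu:.2f} BLEU, implied baseline {base:.2f} BLEU")
```

Under that assumption the implied baselines come out at roughly 45.30 BLEU for EN-FR and 35.40 BLEU for EN-DE. If the percentages were computed some other way, these values would not apply.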

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/7de9/9519290/3c2870afc465/CIN2022-1912750.001.jpg
