Beyond the Transformer: A Novel Polynomial Inherent Attention (PIA) Model and Its Great Impact on Neural Machine Translation.

Affiliation

Department of Computer Science, Prince Sultan University, Riyadh, Saudi Arabia.

Publication information

Comput Intell Neurosci. 2022 Sep 21;2022:1912750. doi: 10.1155/2022/1912750. eCollection 2022.

DOI:10.1155/2022/1912750

PMID:36188704

Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC9519290/

Abstract

This paper describes a novel polynomial inherent attention (PIA) model that outperforms all state-of-the-art transformer models on neural machine translation (NMT) by a wide margin. PIA is based on the simple idea that natural language sentences can be transformed into a special type of binary attention context vectors that accurately capture the semantic context and the relative dependencies between words in a sentence. The transformation is performed using a simple power-of-two polynomial transformation that keeps each word in a strictly consistent position in the resulting vectors. The paper shows how this transformation reduces the neural machine translation process to a simple neural polynomial regression model that provides excellent solutions to the alignment and positioning problems haunting transformer models. The test BLEU scores obtained on the WMT-2014 data sets are 75.07 for EN-FR and 66.35 for EN-DE, well above the scores achieved by state-of-the-art transformer models on the same data sets. The improvements are 65.7% and 87.42%, respectively.
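The abstract gives the PIA scores and the percentage improvements but not the baseline scores being compared against. As a rough sanity check, the sketch below back-computes the baseline BLEU that each figure would imply. It assumes the percentages are relative improvements over a baseline (score = baseline × (1 + gain/100)); the abstract does not state this explicitly, and the helper name `implied_baseline` is hypothetical.

```python
def implied_baseline(score: float, relative_gain_pct: float) -> float:
    """Return the baseline b such that score == b * (1 + relative_gain_pct / 100).

    Assumption: the reported improvement is relative to the baseline score.
    """
    return score / (1.0 + relative_gain_pct / 100.0)


# (reported PIA BLEU, reported improvement in percent), taken from the abstract.
reported = {
    "EN-FR": (75.07, 65.7),
    "EN-DE": (66.35, 87.42),
}

baselines = {pair: implied_baseline(s, g) for pair, (s, g) in reported.items()}

for pair, baseline in baselines.items():
    print(f"{pair}: implied baseline of about {baseline:.2f} BLEU")
```

Under that assumption, the implied baselines come out at roughly 45.3 BLEU for EN-FR and 35.4 BLEU for EN-DE. If the percentages were instead meant as absolute point differences, the arithmetic would not apply.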

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/7de9/9519290/3c2870afc465/CIN2022-1912750.001.jpg

Similar articles

Heavyweight Statistical Alignment to Guide Neural Translation.

Comput Intell Neurosci. 2022 Jun 3;2022:6856567. doi: 10.1155/2022/6856567. eCollection 2022.

A Transformer-Based Neural Machine Translation Model for Arabic Dialects That Utilizes Subword Units.

Sensors (Basel). 2021 Sep 29;21(19):6509. doi: 10.3390/s21196509.

Processing of Translation-Ambiguous Words by Chinese-English Bilinguals in Sentence Context.

J Psycholinguist Res. 2019 Oct;48(5):1133-1161. doi: 10.1007/s10936-019-09650-1.

Construction of English Translation Model Based on Neural Network Fuzzy Semantic Optimal Control.

Comput Intell Neurosci. 2022 May 2;2022:9308236. doi: 10.1155/2022/9308236. eCollection 2022.

Optimization of English Machine Translation by Deep Neural Network under Artificial Intelligence.

Comput Intell Neurosci. 2022 Apr 21;2022:2003411. doi: 10.1155/2022/2003411. eCollection 2022.

Adoption of Wireless Network and Artificial Intelligence Algorithm in Chinese-English Tense Translation.

Comput Intell Neurosci. 2022 Jun 11;2022:1662311. doi: 10.1155/2022/1662311. eCollection 2022.

Improving neural machine translation with POS-tag features for low-resource language pairs.

Heliyon. 2022 Aug 22;8(8):e10375. doi: 10.1016/j.heliyon.2022.e10375. eCollection 2022 Aug.

Neural sentence embedding models for semantic similarity estimation in the biomedical domain.

BMC Bioinformatics. 2019 Apr 11;20(1):178. doi: 10.1186/s12859-019-2789-2.

Predicting Semantic Similarity Between Clinical Sentence Pairs Using Transformer Models: Evaluation and Representational Analysis.

JMIR Med Inform. 2021 May 26;9(5):e23099. doi: 10.2196/23099.

Cited by

Decoding the digital: a corpus-based study of simplifications and other translation universals in translated texts.

Front Psychol. 2025 May 16;16:1517107. doi: 10.3389/fpsyg.2025.1517107. eCollection 2025.

A diachronic study determining syntactic and semantic features of Urdu-English neural machine translation.

Heliyon. 2023 Nov 29;10(1):e22883. doi: 10.1016/j.heliyon.2023.e22883. eCollection 2024 Jan 15.

References

Attention in Natural Language Processing.

IEEE Trans Neural Netw Learn Syst. 2021 Oct;32(10):4291-4308. doi: 10.1109/TNNLS.2020.3019893. Epub 2021 Oct 5.

Neural Machine Translation with Deep Attention.

IEEE Trans Pattern Anal Mach Intell. 2020 Jan;42(1):154-163. doi: 10.1109/TPAMI.2018.2876404. Epub 2018 Oct 16.
