Mixed-Level Neural Machine Translation

Authors

Nguyen Thien, Nguyen Huu, Tran Phuoc

Affiliations

Faculty of Information Technology, Ton Duc Thang University, Ho Chi Minh City, Vietnam.

Faculty of Information Technology, Ho Chi Minh City University of Food Industry, Ho Chi Minh City, Vietnam.

Publication information

Comput Intell Neurosci. 2020 Nov 29;2020:8859452. doi: 10.1155/2020/8859452. eCollection 2020.

DOI:10.1155/2020/8859452

PMID:33335545

Original article: https://pmc.ncbi.nlm.nih.gov/articles/PMC7722455/

Abstract

While building the first Russian-Vietnamese neural machine translation system, we faced the problem of choosing the translation unit system on which the source and target embeddings are based. Available homogeneous translation unit systems, which use the same translation unit on the source and target sides, do not suit the investigated language pair well. To solve this problem, we propose a novel heterogeneous translation unit system that takes into account the linguistic characteristics of Russian, a synthetic language, and Vietnamese, an analytic language. Specifically, we lower the embedding level on the source side by splitting tokens into subtokens, and raise the embedding level on the target side by merging neighboring tokens into supertokens. Experimental results show that the proposed heterogeneous system improves over the best existing homogeneous Russian-Vietnamese translation system by 1.17 BLEU. Our approach could be applied to building translation bots for language pairs with different linguistic characteristics.
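The abstract does not say how tokens are split or merged, so the following is only a minimal sketch of the heterogeneous idea, not the authors' method. It assumes greedy longest-match segmentation against a hypothetical subword vocabulary on the Russian side and a hypothetical multi-syllable lexicon on the Vietnamese side. The function names, the `@@` continuation marker, and the `_` joiner are illustrative assumptions.

```python
from typing import List, Set


def split_into_subtokens(token: str, subword_vocab: Set[str],
                         cont: str = "@@") -> List[str]:
    """Split one source token into subtokens (hypothetical helper).

    Greedy longest match against subword_vocab; any stretch not
    covered by the vocabulary falls back to single characters.
    """
    pieces: List[str] = []
    i = 0
    while i < len(token):
        # Try the longest remaining substring first; a single
        # character is always accepted as the fallback.
        for j in range(len(token), i, -1):
            if token[i:j] in subword_vocab or j == i + 1:
                pieces.append(token[i:j])
                i = j
                break
    # Mark every non-final piece so the split can be undone later.
    return [p + cont for p in pieces[:-1]] + pieces[-1:]


def merge_into_supertokens(tokens: List[str], lexicon: Set[str],
                           max_len: int = 3,
                           joiner: str = "_") -> List[str]:
    """Merge neighboring target tokens into supertokens (hypothetical helper).

    Greedy longest match: up to max_len adjacent tokens are joined
    when their space-separated form appears in lexicon.
    """
    out: List[str] = []
    i = 0
    while i < len(tokens):
        for n in range(min(max_len, len(tokens) - i), 0, -1):
            candidate = " ".join(tokens[i:i + n])
            if n == 1 or candidate in lexicon:
                out.append(joiner.join(tokens[i:i + n]))
                i += n
                break
    return out


# Toy data, assumed for illustration only.
ru_subwords = {"книг", "ами"}
vi_lexicon = {"thư viện"}

source_units = split_into_subtokens("книгами", ru_subwords)
target_units = merge_into_supertokens(
    "tôi đọc sách ở thư viện".split(), vi_lexicon)
```

With this toy data, the inflected Russian form is broken into a stem and an ending, while the two Vietnamese syllables that form one word are joined into a single unit. A real system would learn the subword vocabulary and the merge lexicon from the training corpus rather than hard-code them.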

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/8aa1/7722455/e39f22adcf3a/CIN2020-8859452.alg.001.jpg

Similar articles

Heavyweight Statistical Alignment to Guide Neural Translation.

Comput Intell Neurosci. 2022 Jun 3;2022:6856567. doi: 10.1155/2022/6856567. eCollection 2022.

Pseudotext Injection and Advance Filtering of Low-Resource Corpus for Neural Machine Translation.

Comput Intell Neurosci. 2021 Apr 11;2021:6682385. doi: 10.1155/2021/6682385. eCollection 2021.

A Character Level Based and Word Level Based Approach for Chinese-Vietnamese Machine Translation.

Comput Intell Neurosci. 2016;2016:9821608. doi: 10.1155/2016/9821608. Epub 2016 Jun 29.

Machine Translation System Using Deep Learning for English to Urdu.

Comput Intell Neurosci. 2022 Jan 3;2022:7873012. doi: 10.1155/2022/7873012. eCollection 2022.

Neural machine translation of clinical texts between long distance languages.

J Am Med Inform Assoc. 2019 Dec 1;26(12):1478-1487. doi: 10.1093/jamia/ocz110.

A Neural Machine Translation Model for Arabic Dialects That Utilises Multitask Learning (MTL).

Comput Intell Neurosci. 2018 Dec 10;2018:7534712. doi: 10.1155/2018/7534712. eCollection 2018.

Adaptation of machine translation for multilingual information retrieval in the medical domain.

Artif Intell Med. 2014 Jul;61(3):165-85. doi: 10.1016/j.artmed.2014.01.004. Epub 2014 Feb 5.

Telemedicine as a special case of machine translation.

Comput Med Imaging Graph. 2015 Dec;46 Pt 2:249-56. doi: 10.1016/j.compmedimag.2015.09.005. Epub 2015 Sep 30.

ParaMed: a parallel corpus for English-Chinese translation in the biomedical domain.

BMC Med Inform Decis Mak. 2021 Sep 6;21(1):258. doi: 10.1186/s12911-021-01621-8.

Cited by

Heavyweight Statistical Alignment to Guide Neural Translation.

Comput Intell Neurosci. 2022 Jun 3;2022:6856567. doi: 10.1155/2022/6856567. eCollection 2022.

References

A Character Level Based and Word Level Based Approach for Chinese-Vietnamese Machine Translation.

Comput Intell Neurosci. 2016;2016:9821608. doi: 10.1155/2016/9821608. Epub 2016 Jun 29.
