Exploring the evolution and future prospects of Amharic to English machine translation: a systematic review.

Author information

Asebel Muluken Hussen, Assefa Shimelis Getu, Haile Mesfin Abebe

Affiliations

Department of Computer Science and Engineering, School of Electrical Engineering and Computing, Adama Science and Technology University, Adama, Ethiopia.

Department of Research Methods and Information Science, Morgridge College of Education, University of Denver, Denver, CO, United States.

Publication information

Front Artif Intell. 2025 May 23;8:1456245. doi: 10.3389/frai.2025.1456245. eCollection 2025.

Abstract

INTRODUCTION

Over the past couple of decades, Amharic-English machine translation has advanced from rule-based approaches to contemporary systems built on neural networks. Despite these advances, problems remain because Amharic is a low-resource language: datasets and language-processing tools are inadequate, and Amharic's semantics and grammar are intricate compared with those of English. This systematic review analyzes the evolution of Amharic-English machine translation, its main ongoing difficulties, notable research efforts, and prospects for future research.

METHODS

This review takes a systematic approach to the literature on Amharic-English machine translation. Documents were retrieved from academic websites, and those relevant to machine translation methodologies, language resource development, and evaluation practices were selected. The review focused primarily on statistical and neural machine translation models, especially those with transformer architectures.

RESULTS

Early attempts to translate between English and Amharic relied on statistical machine translation (SMT), which set the stage for the shift to neural machine translation (NMT). Transformer models have substantially improved the accuracy and fluency of translations. Still, sufficient parallel corpora, effective tokenization methods for Amharic, and other resources are lacking. Recent work has focused on creating new datasets, improving token-level engineering, and adapting NMT models to Amharic's complex morphological structure.

DISCUSSION

Complete solutions for improving Amharic-English translation remain elusive. Open problems include insufficient data and difficulties in maintaining semantic correspondence and grammatical consistency within and across translations. Promising avenues include data augmentation, tokenization tailored to the language, and the incorporation of linguistic elements into parallel corpora. In addition, effective evaluation frameworks and comprehensive linguistic data are important for assessing and improving translation tools. These changes would support cross-cultural interaction and broaden access to modern technologies.
