Exploring ChatGPT's potential for augmenting post-editing in machine translation across multiple domains: challenges and opportunities.

Authors

Algaraady Jeehaan, Mahyoob Mohammad

Affiliations

Department of Languages and Translation, Taiz University, Taiz, Yemen.

Department of Languages and Translation, Taibah University, Madina, Saudi Arabia.

Publication information

Front Artif Intell. 2025 May 1;8:1526293. doi: 10.3389/frai.2025.1526293. eCollection 2025.

DOI:10.3389/frai.2025.1526293
PMID:40376281
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC12078335/
Abstract

INTRODUCTION

Post-editing plays a crucial role in enhancing the quality of machine-generated translation (MGT) by correcting errors and ensuring cohesion and coherence. With advancements in artificial intelligence, Large Language Models (LLMs) like ChatGPT-4o offer promising capabilities for post-editing tasks. This study investigates the effectiveness of ChatGPT-4o as a natural language processing tool in post-editing Arabic translations across various domains, aiming to evaluate its performance in improving productivity, accuracy, consistency, and overall translation quality.

METHODS

The study involved a comparative analysis of Arabic translations generated by Google Translate. These texts, drawn from multiple domains, were post-edited by two professional human translators and ChatGPT-4o. Subsequently, three additional professional human post-editors evaluated both sets of post-edited outputs. To statistically assess the differences in quality between humans and ChatGPT-4o post-edits, a paired t-test was employed, focusing on metrics such as fluency, accuracy, coherence, and efficiency.
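As a rough illustration of the paired comparison described above, the sketch below computes a paired t-statistic from two sets of matched ratings. The function name and the scores are hypothetical and are not taken from the study; the abstract does not report the underlying ratings or how they were paired.

```python
import math
from statistics import mean, stdev


def paired_t_statistic(a, b):
    """Return (t, df) for a paired t-test on equal-length samples a and b."""
    if len(a) != len(b) or len(a) < 2:
        raise ValueError("need two equal-length samples with at least 2 pairs")
    # Work on the per-pair differences, which is what makes the test "paired".
    diffs = [x - y for x, y in zip(a, b)]
    sd = stdev(diffs)  # sample standard deviation of the differences
    if sd == 0:
        raise ValueError("differences have zero variance; t is undefined")
    n = len(diffs)
    t = mean(diffs) / (sd / math.sqrt(n))
    return t, n - 1


# Hypothetical ratings from three evaluators (illustrative values only).
chatgpt_scores = [9.0, 8.0, 9.5]
human_scores = [7.0, 6.5, 7.0]

t, df = paired_t_statistic(chatgpt_scores, human_scores)
print(f"t = {t:.3f}, df = {df}")
```

A positive t here means the first sample scored higher on average than the second, which matches the sign convention the abstract appears to use for efficiency.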

RESULTS

The findings indicated that human post-editors outperformed ChatGPT-4o in most quality metrics. However, ChatGPT-4o demonstrated superior efficiency, yielding a positive t-statistic of 8.00 and a p-value of 0.015, indicating a statistically significant difference. Regarding fluency, no significant difference was observed between the two methods (t-statistic = -3.5, p-value = 0.074), suggesting comparable performance in ensuring the natural flow of text.
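The reported statistics can be sanity-checked with a short calculation. The sketch below assumes 2 degrees of freedom, that is, one paired observation per evaluator for the three evaluators; the abstract does not state the degrees of freedom, so this is an assumption rather than the authors' stated setup. For 2 degrees of freedom the Student's t distribution has a closed-form CDF, so no statistics library is needed.

```python
import math


def two_tailed_p_df2(t):
    """Two-tailed p-value for Student's t with exactly 2 degrees of freedom."""
    # For df = 2 the CDF is F(t) = 1/2 + t / (2 * sqrt(t^2 + 2)),
    # so the two-tailed p-value simplifies to 1 - |t| / sqrt(t^2 + 2).
    return 1.0 - abs(t) / math.sqrt(t * t + 2.0)


p_efficiency = two_tailed_p_df2(8.00)  # about 0.0153
p_fluency = two_tailed_p_df2(-3.5)     # about 0.0728

print(f"efficiency: p = {p_efficiency:.4f}")
print(f"fluency:    p = {p_fluency:.4f}")
```

Under this assumption the efficiency result reproduces the reported p-value of 0.015. The fluency value comes out near 0.073 rather than 0.074, a gap that would be consistent with the t-statistic having been rounded to -3.5 before publication.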

DISCUSSION

ChatGPT-4o showed competitive performance in English-to-Arabic post-editing, particularly in producing fluent, coherent, and stylistically consistent text. Its conversational design enables efficient and consistent editing across various domains. Nonetheless, the model faced challenges in handling grammatical and syntactic nuances, domain-specific idioms, and complex terminology, especially in medical and sports contexts. Overall, the study highlights the potential of ChatGPT-4o as a supportive tool in translation post-editing workflows, complementing human translators by enhancing productivity and maintaining acceptable quality standards.

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/184b/12078335/771835c97182/frai-08-1526293-g0002.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/184b/12078335/df5fab363277/frai-08-1526293-g0001.jpg
