PerAnSel: A Novel Deep Neural Network-Based System for Persian Question Answering.

Author information

Big Data Research Group, Faculty of Computer Engineering, University of Isfahan, Isfahan, Iran.

Department of Linguistics, Faculty of Foreign Languages, University of Isfahan, Isfahan, Iran.

Publication information

Comput Intell Neurosci. 2022 Jul 18;2022:3661286. doi: 10.1155/2022/3661286. eCollection 2022.

Abstract

Question answering (QA) systems have attracted considerable attention in recent years. They receive a user's question in natural language and respond with a precise answer. Most QA work was initially proposed for English, but some studies have recently addressed non-English languages. Answer selection (AS) is a critical component of QA systems. To the best of our knowledge, there is no prior research on AS for the Persian language. Persian is a (1) free word-order, (2) right-to-left, (3) morphologically rich, and (4) low-resource language. Deep learning (DL) techniques have shown promising accuracy in AS. Although DL performs very well on QA, it requires a considerable amount of annotated data for training. Many annotated datasets have been built for the AS task, but most of them are exclusively in English. To address the need for a high-quality AS dataset in Persian, we present PASD, the first large-scale native AS dataset for the Persian language. To show the quality of PASD, we employed it to train state-of-the-art QA systems. We also present PerAnSel, a novel deep neural network-based system for Persian question answering. Because Persian is a free word-order language, PerAnSel runs a sequential method and a transformer-based method in parallel to handle the various word orders in Persian. We then evaluate PerAnSel on three datasets: PASD, PerCQA, and WikiFA. The experimental results indicate strong performance on the Persian datasets: PerAnSel outperforms state-of-the-art answer selection methods in terms of mean reciprocal rank (MRR) by 10.66% on PASD, 8.42% on PerCQA, and 3.08% on WikiFA.
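
For readers unfamiliar with the metric, the following is a minimal sketch of how MRR is commonly computed for answer selection; it is not taken from the paper. The function names and the toy scores are hypothetical, and the sketch assumes that a question with no correct candidate contributes a reciprocal rank of 0 (some evaluations instead exclude such questions).

```python
from typing import List, Sequence, Tuple


def reciprocal_rank(scores: Sequence[float], labels: Sequence[int]) -> float:
    """Return 1/rank of the best-ranked correct candidate, or 0.0 if none is correct."""
    # Rank candidates by model score, highest first.
    ranked = sorted(zip(scores, labels), key=lambda pair: pair[0], reverse=True)
    for rank, (_, label) in enumerate(ranked, start=1):
        if label == 1:
            return 1.0 / rank
    return 0.0  # assumption: questions without a correct candidate score 0


def mean_reciprocal_rank(
    questions: List[Tuple[Sequence[float], Sequence[int]]]
) -> float:
    """Average the reciprocal rank over all questions."""
    if not questions:
        return 0.0
    return sum(reciprocal_rank(s, l) for s, l in questions) / len(questions)


# Hypothetical toy data: (candidate scores, 0/1 correctness labels) per question.
toy_questions = [
    ([0.9, 0.2, 0.5], [0, 0, 1]),  # correct answer ranked 2nd
    ([0.1, 0.8], [0, 1]),          # correct answer ranked 1st
    ([0.3, 0.7], [0, 0]),          # no correct answer among candidates
]

if __name__ == "__main__":
    print(mean_reciprocal_rank(toy_questions))
```

In this toy example the three reciprocal ranks are 1/2, 1, and 0, so the mean is 0.5.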
