PerAnSel: A Novel Deep Neural Network-Based System for Persian Question Answering.

Affiliations

Big Data Research Group, Faculty of Computer Engineering, University of Isfahan, Isfahan, Iran.

Department of Linguistics, Faculty of Foreign Languages, University of Isfahan, Isfahan, Iran.

Publication information

Comput Intell Neurosci. 2022 Jul 18;2022:3661286. doi: 10.1155/2022/3661286. eCollection 2022.

DOI:10.1155/2022/3661286

PMID:35898771

Original article: https://pmc.ncbi.nlm.nih.gov/articles/PMC9313912/

Abstract

Question answering (QA) systems have attracted considerable attention in recent years. They receive a user's questions in natural language and respond with precise answers. Most QA work was initially developed for English, although recent studies have addressed other languages. Answer selection (AS) is a critical component of QA systems. To the best of our knowledge, there is no prior research on AS for the Persian language. Persian is a (1) free word order, (2) right-to-left, (3) morphologically rich, and (4) low-resource language. Deep learning (DL) techniques have shown promising accuracy in AS. Although DL performs very well on QA, it requires a considerable amount of annotated training data. Many annotated datasets have been built for the AS task, but most of them are exclusively in English. To address the need for a high-quality AS dataset in Persian, we present PASD, the first large-scale native AS dataset for the Persian language. To demonstrate the quality of PASD, we used it to train state-of-the-art QA systems. We also present PerAnSel, a novel deep neural network-based system for Persian question answering. Because Persian is a free word order language, PerAnSel runs a sequential method and a transformer-based method in parallel to handle the various word orders that occur in Persian. We evaluate PerAnSel on three datasets: PASD, PerCQA, and WikiFA. In terms of mean reciprocal rank (MRR), PerAnSel outperforms state-of-the-art answer selection methods by 10.66% on PASD, 8.42% on PerCQA, and 3.08% on WikiFA.
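The abstract reports its results in MRR. As a minimal sketch of how that metric is commonly computed for answer selection, the following assumes each question comes with a list of candidate scores and binary relevance labels; the function names and the toy data are illustrative assumptions and are not taken from the paper or its evaluation code.

```python
from typing import Sequence


def reciprocal_rank(ranked_labels: Sequence[int]) -> float:
    """Return 1/rank of the first correct candidate, or 0.0 if none is correct."""
    for position, label in enumerate(ranked_labels, start=1):
        if label == 1:
            return 1.0 / position
    return 0.0


def mean_reciprocal_rank(
    scores_per_question: Sequence[Sequence[float]],
    labels_per_question: Sequence[Sequence[int]],
) -> float:
    """Average the reciprocal rank over all questions.

    Candidates for each question are ranked by descending model score
    (hypothetical scores; any answer selection model could supply them).
    """
    if not scores_per_question:
        raise ValueError("at least one question is required")
    total = 0.0
    for scores, labels in zip(scores_per_question, labels_per_question):
        # Indices of the candidates, best-scored first.
        order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
        total += reciprocal_rank([labels[i] for i in order])
    return total / len(scores_per_question)


# Toy data: two questions with three and two candidate answers.
toy_scores = [[0.2, 0.9, 0.1], [0.8, 0.3]]
toy_labels = [[1, 0, 0], [1, 0]]
toy_mrr = mean_reciprocal_rank(toy_scores, toy_labels)
print(toy_mrr)
```

In the toy data, the correct answer to the first question is ranked second (reciprocal rank 0.5) and the correct answer to the second question is ranked first (reciprocal rank 1.0), so the mean is 0.75.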

Similar articles

SemBioNLQA: A semantic biomedical question answering system for retrieving exact and ideal answers to natural language questions.

Artif Intell Med. 2020 Jan;102:101767. doi: 10.1016/j.artmed.2019.101767. Epub 2019 Nov 28.

Answering medical questions in Chinese using automatically mined knowledge and deep neural networks: an end-to-end solution.

BMC Bioinformatics. 2022 Apr 15;23(1):136. doi: 10.1186/s12859-022-04658-2.

A question-entailment approach to question answering.

BMC Bioinformatics. 2019 Oct 22;20(1):511. doi: 10.1186/s12859-019-3119-4.

Reading comprehension based question answering system in Bangla language with transformer-based learning.

Heliyon. 2022 Oct 12;8(10):e11052. doi: 10.1016/j.heliyon.2022.e11052. eCollection 2022 Oct.

A subgraph-representation-based method for answering complex questions over knowledge bases.

Neural Netw. 2019 Nov;119:57-65. doi: 10.1016/j.neunet.2019.07.014. Epub 2019 Jul 26.

Word embeddings and external resources for answer processing in biomedical factoid question answering.

J Biomed Inform. 2019 Apr;92:103118. doi: 10.1016/j.jbi.2019.103118. Epub 2019 Feb 10.

CapsTM: capsule network for Chinese medical text matching.

BMC Med Inform Decis Mak. 2021 Jul 30;21(Suppl 2):94. doi: 10.1186/s12911-021-01442-9.

A passage retrieval method based on probabilistic information retrieval model and UMLS concepts in biomedical question answering.

J Biomed Inform. 2017 Apr;68:96-103. doi: 10.1016/j.jbi.2017.03.001. Epub 2017 Mar 7.

Hierarchical fusion of common sense knowledge and classifier decisions for answer selection in community question answering.

Neural Netw. 2020 Dec;132:53-65. doi: 10.1016/j.neunet.2020.08.005. Epub 2020 Aug 20.

Cited by

Improving the quality of Persian clinical text with a novel spelling correction system.

BMC Med Inform Decis Mak. 2024 Aug 5;24(1):220. doi: 10.1186/s12911-024-02613-0.
