Integrate Candidate Answer Extraction with Re-Ranking for Chinese Machine Reading Comprehension.

Authors

Zeng Junjie, Sun Xiaoya, Zhang Qi, Li Xinmeng

Affiliation

College of Systems Engineering, National University of Defense Technology, Changsha 410073, China.

Publication information

Entropy (Basel). 2021 Mar 8;23(3):322. doi: 10.3390/e23030322.

Abstract

Machine Reading Comprehension (MRC) research concerns how to endow machines with the ability to understand given passages and answer questions about them, a challenging problem in natural language processing. To solve the Chinese MRC task efficiently, this paper proposes an Improved Extraction-based Reading Comprehension method with Answer Re-ranking (IERC-AR), consisting of a candidate answer extraction module and a re-ranking module. The candidate answer extraction module uses an improved pre-trained language model, RoBERTa-WWM, to generate precise word representations, which helps resolve polysemy and capture Chinese word-level features. The re-ranking module re-evaluates the candidate answers with a self-attention mechanism, which can improve the accuracy of the predicted answers. Traditional machine-reading methods generally chain separate modules into a pipeline system, which leads to re-encoding problems and inconsistent data distributions between the training and testing phases; this paper therefore proposes an end-to-end model architecture for IERC-AR that integrates the candidate answer extraction and re-ranking modules. Experimental results on the Les MMRC dataset show that IERC-AR outperforms state-of-the-art MRC approaches.
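
The two-stage idea in the abstract (extract candidate answer spans, then re-rank them with self-attention) can be illustrated with a small sketch. This is not the authors' implementation: the function names (`top_k_spans`, `self_attention`, `rerank`), the toy scores, the two-dimensional token vectors, the identity attention projections, and the linear `scorer` are all assumptions made for illustration. In IERC-AR the token representations would come from RoBERTa-WWM and the re-ranker would be trained. The sketch reuses the same token vectors in both stages, which loosely mirrors the abstract's point about avoiding re-encoding.

```python
import math

def top_k_spans(start_scores, end_scores, k=3, max_len=4):
    """Return the k best (start, end, score) spans, scored as start + end."""
    spans = [
        (s, e, start_scores[s] + end_scores[e])
        for s in range(len(start_scores))
        for e in range(s, min(s + max_len, len(end_scores)))
    ]
    return sorted(spans, key=lambda span: span[2], reverse=True)[:k]

def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [x / total for x in exps]

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def self_attention(vectors):
    """Scaled dot-product self-attention with identity projections (toy)."""
    d = len(vectors[0])
    attended = []
    for query in vectors:
        weights = softmax([dot(query, key) / math.sqrt(d) for key in vectors])
        attended.append([dot(weights, [v[i] for v in vectors]) for i in range(d)])
    return attended

def rerank(spans, token_vectors, scorer):
    """Re-score candidates after letting them attend to one another."""
    pooled = []
    for s, e, _ in spans:
        tokens = token_vectors[s:e + 1]
        # Mean-pool the token vectors inside each candidate span.
        pooled.append([sum(col) / len(tokens) for col in zip(*tokens)])
    attended = self_attention(pooled)
    rescored = [
        (s, e, score + dot(scorer, vec))
        for (s, e, score), vec in zip(spans, attended)
    ]
    return sorted(rescored, key=lambda span: span[2], reverse=True)

# Toy inputs (assumed values, not from the paper).
start_scores = [0.1, 2.0, 0.3, 0.2]
end_scores = [0.0, 0.5, 1.5, 0.1]
token_vectors = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [0.5, 0.5]]

candidates = top_k_spans(start_scores, end_scores, k=2, max_len=3)
ranked = rerank(candidates, token_vectors, scorer=[0.2, -0.1])
```

Adding the re-ranking score to the extraction score is only one plausible way to combine the two stages; the abstract does not say how the final score is computed.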

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/29d1/8001296/a7b3074b0452/entropy-23-00322-g001.jpg
