Extractive text summarization model based on advantage actor-critic and graph matrix methodology.

Author information

Yang Senqi, Duan Xuliang, Wang Xi, Tang Dezhao, Xiao Zeyan, Guo Yan

Affiliations

College of Information and Engineering, Sichuan Agricultural University, Ya'an, China.

The Lab of Agricultural Information Engineering, Sichuan Key Laboratory, Ya'an, China.

Publication information

Math Biosci Eng. 2023 Jan;20(1):1488-1504. doi: 10.3934/mbe.2023067. Epub 2022 Oct 31.

Abstract

Automatic text summarization faces great challenges. The main issue is identifying the most informative segments of the input text; establishing an effective evaluation mechanism is another major challenge. The mainstream approach is currently to train deep learning models, but serious exposure bias during training prevents these models from achieving better results. This paper therefore introduces an extractive text summarization model based on a graph matrix and the advantage actor-critic method (GA2C). Articles are pre-processed to generate a graph matrix. Based on the states provided by the graph matrix, the decision-making network makes decisions and sends them to the evaluation network, which scores them. The decision-making network then adjusts its action probabilities according to those scores. On the CNN/Daily Mail dataset, the GA2C model outperformed the baseline reinforcement learning-based extractive summarization model (Refresh) on Rouge-1, Rouge-2 and Rouge-A by 0.70, 9.01 and 2.73, respectively. We also conducted multiple ablation experiments to verify the GA2C model from different perspectives. Different activation functions and evaluation networks were tested in the GA2C model to identify the best activation function and evaluation network. In addition, two reward functions (ADD, which accumulates a fixed reward value, and Rouge) were combined with two similarity matrices (cosine and Jaccard) in the experiments.
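As a rough illustration of the two similarity matrices named in the abstract, the sketch below builds a sentence-by-sentence graph matrix with either cosine or Jaccard similarity. It is a minimal sketch under stated assumptions, not the authors' implementation: the whitespace tokenizer, the bag-of-words representation, and the function names (`tokenize`, `graph_matrix`, `advantage`) are all hypothetical, since the abstract does not describe the pre-processing. The `advantage` helper shows only the generic actor-critic form (reward minus the critic's value estimate), which may differ from the update used in the paper.

```python
import math
from collections import Counter


def tokenize(sentence):
    """Lowercase whitespace tokenization (assumption: the paper's
    actual pre-processing is not given in the abstract)."""
    return sentence.lower().split()


def jaccard(a, b):
    """Jaccard similarity between the token sets of two sentences."""
    sa, sb = set(a), set(b)
    union = sa | sb
    return len(sa & sb) / len(union) if union else 0.0


def cosine(a, b):
    """Cosine similarity between bag-of-words count vectors."""
    ca, cb = Counter(a), Counter(b)
    dot = sum(ca[t] * cb[t] for t in ca)
    norm_a = math.sqrt(sum(v * v for v in ca.values()))
    norm_b = math.sqrt(sum(v * v for v in cb.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def graph_matrix(sentences, similarity=cosine):
    """Return an n x n matrix whose (i, j) entry is the similarity
    between sentence i and sentence j."""
    tokens = [tokenize(s) for s in sentences]
    n = len(tokens)
    return [[similarity(tokens[i], tokens[j]) for j in range(n)]
            for i in range(n)]


def advantage(reward, value_estimate):
    """Generic advantage: how much better the observed reward was than
    the critic's estimate (assumption: standard A2C form)."""
    return reward - value_estimate


if __name__ == "__main__":
    article = ["the cat sat", "the cat ran", "dogs bark"]
    for row in graph_matrix(article, similarity=jaccard):
        print(row)
```

With the Jaccard variant, the first two example sentences share two of four distinct tokens, so their matrix entry is 0.5, while either of them against the third sentence gives 0.0. In an actor-critic setup, a positive advantage would raise the probability of the chosen action and a negative one would lower it.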

