Enhanced transformer for length-controlled abstractive summarization based on summary output area.
Authors
Sunusi Yusuf, Omar Nazlia, Zakaria Lailatul Qadri
Affiliation
Center for Artificial Intelligence Technology, Faculty of Information Science and Technology, Universiti Kebangsaan Malaysia, Bangi, Selangor, Malaysia.
Publication
PeerJ Comput Sci. 2025 Mar 11;11:e2667. doi: 10.7717/peerj-cs.2667. eCollection 2025.
Recent advancements in abstractive summarization models, particularly those built on encoder-decoder architectures, typically produce a single summary for each source text. Controlling the length of summaries is crucial for practical applications, such as crafting cover summaries for newspapers or magazines with varying slot sizes. Current research in length-controllable abstractive summarization employs techniques like length embeddings in the decoder module or a word-level extractive module in the encoder-decoder model. However, these approaches, while effective in determining when to halt decoding, fall short in selecting relevant information to include within the specified length constraint. This article diverges from prior models reliant on predefined lengths. Instead, it introduces a novel approach to length-controllable abstractive summarization by integrating an image processing phase. This phase determines the specific size of the summary output slot. The proposed model harnesses enhanced T5 and GPT models, seamlessly adapting summaries to designated slots. The computed area of a given slot is employed in both models to generate abstractive summaries tailored to fit the output slot perfectly. Experimental evaluations on the CNN/Daily Mail dataset demonstrate the model's success in performing length-controlled summarization, yielding superior results.
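The abstract describes computing the area of the summary output slot and using it in both models to bound summary length. As a minimal illustrative sketch (not the authors' implementation; the function name, sizing constants, and characters-per-token ratio are all assumptions), slot dimensions could be converted into a token budget for the generator:

```python
def slot_token_budget(width_cm: float, height_cm: float,
                      char_w_cm: float = 0.25, line_h_cm: float = 0.5,
                      chars_per_token: int = 5) -> int:
    """Estimate how many tokens of summary text fit a rectangular output slot.

    The slot area is discretized into lines and characters using assumed
    average glyph dimensions, then converted to tokens with an assumed
    characters-per-token ratio.
    """
    chars_per_line = int(width_cm // char_w_cm)   # characters per printed line
    lines = int(height_cm // line_h_cm)           # lines that fit vertically
    capacity_chars = chars_per_line * lines       # total character capacity
    return max(1, capacity_chars // chars_per_token)

# Example: a 10 cm x 4 cm cover slot yields a budget of 64 tokens
# under the assumed sizing constants.
budget = slot_token_budget(10.0, 4.0)
```

Such a budget could then serve as a decoding-length constraint (e.g., a `max_new_tokens` argument) when generating the summary with the T5 or GPT model, so the output fits the designated slot.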