

Exploiting Intersentence Information for Better Question-Driven Abstractive Summarization: Algorithm Development and Validation.

Authors

Wang Xin, Wang Jian, Xu Bo, Lin Hongfei, Zhang Bo, Yang Zhihao

Affiliations

School of Computer Science and Technology, Dalian University of Technology, Dalian, China.

Publication

JMIR Med Inform. 2022 Aug 15;10(8):e38052. doi: 10.2196/38052.

Abstract

BACKGROUND

Question-driven summarization has become a practical and accurate approach to summarizing a source document. The generated summary should be concise and consistent with the question of interest, so that it can be regarded as the answer to a nonfactoid question. Existing methods do not fully exploit question information over documents or dependencies across sentences. In addition, most existing summarization evaluation tools, such as Recall-Oriented Understudy for Gisting Evaluation (ROUGE), calculate N-gram overlaps between the generated summary and the reference summary while neglecting the problem of factual consistency.
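
ROUGE-N, as described above, scores a candidate summary by its n-gram overlap with a reference. The snippet below is a minimal illustration of that recall-oriented counting, not the full ROUGE toolkit (which adds stemming, multiple references, and ROUGE-L longest-common-subsequence matching); the function name is my own.

```python
from collections import Counter

def rouge_n_recall(candidate: str, reference: str, n: int = 2) -> float:
    """Simplified ROUGE-N recall: overlapping n-grams divided by the
    number of n-grams in the reference summary."""
    def ngrams(tokens, n):
        return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))

    cand = ngrams(candidate.split(), n)
    ref = ngrams(reference.split(), n)
    overlap = sum((cand & ref).values())  # clipped counts via multiset intersection
    total = sum(ref.values())
    return overlap / total if total else 0.0
```

Because the count is computed against the reference, a candidate can score well by merely copying reference n-grams even when its stated facts contradict the source, which is exactly the factual-consistency blind spot the abstract points out.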

OBJECTIVE

This paper proposes a novel question-driven abstractive summarization model based on the transformer architecture, incorporating a two-step attention mechanism and an overall integration mechanism, which can generate concise and consistent summaries for nonfactoid question answering.

METHODS

Specifically, the two-step attention mechanism is proposed to exploit mutual information both between the question and the context and between each sentence and the other sentences. We further introduced an overall integration mechanism and a novel pointer network for information integration. We conducted a question-answering task to evaluate the factual consistency between the generated summary and the reference summary.
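
The authors' two-step attention operates inside a transformer; as a rough intuition only, the toy sketch below first attends from the question to each sentence, then attends among sentences, and multiplies the two signals. All function names and the combination rule are assumptions for illustration, not the paper's formulation.

```python
import math

def softmax(xs):
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    s = sum(exps)
    return [e / s for e in exps]

def attend(query, keys):
    """Dot-product attention weights of one query vector over key vectors."""
    scores = [sum(q * k for q, k in zip(query, key)) for key in keys]
    return softmax(scores)

def two_step_weights(question_vec, sentence_vecs):
    """Toy two-step weighting: question->sentence relevance (step 1)
    rescaled by how much attention each sentence receives from the
    other sentences (step 2), then renormalized."""
    n = len(sentence_vecs)
    q_att = attend(question_vec, sentence_vecs)                 # step 1
    s_att = [attend(s, sentence_vecs) for s in sentence_vecs]   # step 2
    received = [sum(s_att[j][i] for j in range(n)) / n for i in range(n)]
    raw = [q * r for q, r in zip(q_att, received)]
    total = sum(raw)
    return [x / total for x in raw]
```

The design intent mirrored here is that a sentence should matter both because the question points at it and because the rest of the document depends on it, which is the intersentence information the title refers to.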

RESULTS

The experimental results of question-driven summarization on the PubMedQA data set showed that our model achieved ROUGE-1, ROUGE-2, and ROUGE-L scores of 36.01, 15.59, and 30.22, respectively, outperforming state-of-the-art methods with a gain of 0.79 (absolute) in the ROUGE-2 score. The question-answering task demonstrates that the summaries generated by our model have better factual consistency. Our method achieved 94.2% accuracy and a 77.57% F1 score.
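
The accuracy and F1 figures above come from the question-answering evaluation. For readers unfamiliar with the F1 metric, it is the harmonic mean of precision and recall; the helper below shows the standard computation (the function name is mine, and the counts are illustrative, not the paper's data).

```python
def precision_recall_f1(tp: int, fp: int, fn: int):
    """Standard precision, recall, and F1 from true-positive,
    false-positive, and false-negative counts."""
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return precision, recall, f1
```

A gap between high accuracy and a lower F1, as reported here, typically indicates class imbalance: many easy negatives inflate accuracy while F1 stays sensitive to errors on the positive class.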

CONCLUSIONS

Our proposed question-driven summarization model effectively exploits the mutual information among the question, document, and summary to generate concise and consistent summaries.


Figure 1: https://cdn.ncbi.nlm.nih.gov/pmc/blobs/f834/9425173/f2d6ba077477/medinform_v10i8e38052_fig1.jpg
