Question-driven summarization of answers to consumer health questions.

Affiliation

Lister Hill National Center for Biomedical Communications, U.S. National Library of Medicine, National Institutes of Health, Bethesda, MD, USA.

Publication information

Sci Data. 2020 Oct 2;7(1):322. doi: 10.1038/s41597-020-00667-z.

Abstract

Automatic summarization of natural language is a widely studied area in computer science, one that is broadly applicable to anyone who needs to understand large quantities of information. In the medical domain, automatic summarization has the potential to make health information more accessible to people without medical expertise. However, to evaluate the quality of summaries generated by summarization algorithms, researchers first require gold-standard, human-generated summaries. Unfortunately, no data are available for assessing summaries that help consumers of health information answer their questions. To address this issue, we present the MEDIQA-Answer Summarization dataset, the first dataset designed for question-driven, consumer-focused summarization. It contains 156 health questions asked by consumers, answers to these questions, and manually generated summaries of these answers. The dataset's unique structure allows it to be used for at least eight different types of summarization evaluations. We also benchmark the performance of baseline and state-of-the-art deep learning approaches on the dataset, demonstrating how it can be used to evaluate automatically generated summaries.
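To make concrete the idea of scoring an automatically generated summary against a human-written reference, here is a minimal sketch in Python. The abstract does not name the metric used in the benchmark, so the unigram-overlap F1 below is an assumption chosen for illustration. The function name `unigram_f1` and the two example strings are hypothetical and are not taken from the dataset.

```python
from collections import Counter


def unigram_f1(reference: str, candidate: str) -> float:
    """Unigram-overlap F1 between a reference and a candidate summary.

    Illustrative only: whitespace tokenization, lowercasing, no stemming.
    This is an assumed stand-in metric, not the one reported in the paper.
    """
    ref_counts = Counter(reference.lower().split())
    cand_counts = Counter(candidate.lower().split())

    # Multiset intersection counts each shared token at most as often
    # as it appears in both summaries.
    overlap = sum((ref_counts & cand_counts).values())
    if overlap == 0:
        return 0.0

    precision = overlap / sum(cand_counts.values())
    recall = overlap / sum(ref_counts.values())
    return 2 * precision * recall / (precision + recall)


# Hypothetical gold summary and system output (not from the dataset).
gold = "drink fluids and rest to recover from the flu"
generated = "rest and drink fluids"
score = unigram_f1(gold, generated)
```

A higher score means the generated summary shares more words with the reference. A real evaluation would likely use an established metric implementation and average the score over all question-answer-summary triples.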


https://cdn.ncbi.nlm.nih.gov/pmc/blobs/77c2/7532186/50bbd699f8b7/41597_2020_667_Fig1_HTML.jpg
