
Similar Articles

1. Self-Repetition in Abstractive Neural Summarizers.
Proc Conf Assoc Comput Linguist Meet. 2022 Nov;2022:341-350.
2. Flight of the PEGASUS? Comparing Transformers on Few-Shot and Zero-Shot Multi-document Abstractive Summarization.
Proc Int Conf Comput Ling. 2020 Dec;2020:5640-5646.
3. Development and Evaluation of a Digital Scribe: Conversation Summarization Pipeline for Emergency Department Counseling Sessions towards Reducing Documentation Burden.
medRxiv. 2023 Dec 7:2023.12.06.23299573. doi: 10.1101/2023.12.06.23299573.
4. Evaluation of a Digital Scribe: Conversation Summarization for Emergency Department Consultation Calls.
Appl Clin Inform. 2024 May 15;15(3):600-11. doi: 10.1055/a-2327-4121.
5. Psychosocial interventions for self-harm in adults.
Cochrane Database Syst Rev. 2021 Apr 22;4(4):CD013668. doi: 10.1002/14651858.CD013668.pub2.
6. Dual Encoding for Abstractive Text Summarization.
IEEE Trans Cybern. 2020 Mar;50(3):985-996. doi: 10.1109/TCYB.2018.2876317. Epub 2018 Nov 2.
7. Knowledge-Infused Abstractive Summarization of Clinical Diagnostic Interviews: Framework Development Study.
JMIR Ment Health. 2021 May 10;8(5):e20865. doi: 10.2196/20865.
8. Neurocognitive basis of repetition deficits in primary progressive aphasia.
Brain Lang. 2019 Jul;194:35-45. doi: 10.1016/j.bandl.2019.04.003. Epub 2019 May 2.
9. Abstractive text summarization of low-resourced languages using deep learning.
PeerJ Comput Sci. 2023 Jan 13;9:e1176. doi: 10.7717/peerj-cs.1176. eCollection 2023.
10. Folic acid supplementation and malaria susceptibility and severity among people taking antifolate antimalarial drugs in endemic areas.
Cochrane Database Syst Rev. 2022 Feb 1;2(2022):CD014217. doi: 10.1002/14651858.CD014217.

Cited By

1. Automated Metrics for Medical Multi-Document Summarization Disagree with Human Evaluations.
Proc Conf Assoc Comput Linguist Meet. 2023 Jul;2023:9871-9889. doi: 10.18653/v1/2023.acl-long.549.

References

1. Generating (Factual?) Narrative Summaries of RCTs: Experiments with Neural Multi-Document Summarization.
AMIA Jt Summits Transl Sci Proc. 2021 May 17;2021:605-614. eCollection 2021.
2. Hierarchical Human-Like Deep Neural Networks for Abstractive Text Summarization.
IEEE Trans Neural Netw Learn Syst. 2021 Jun;32(6):2744-2757. doi: 10.1109/TNNLS.2020.3008037. Epub 2021 Jun 2.

Self-Repetition in Abstractive Neural Summarizers.

Author Information

Salkar Nikita, Trikalinos Thomas, Wallace Byron C, Nenkova Ani

Affiliations

Khoury College of Computer Sciences, Northeastern University, USA.

Health Services, Policy and Practice, Brown University, USA.

Publication Information

Proc Conf Assoc Comput Linguist Meet. 2022 Nov;2022:341-350.

PMID: 37484061
Full text link: https://pmc.ncbi.nlm.nih.gov/articles/PMC10361333/
Abstract

We provide a quantitative and qualitative analysis of self-repetition in the output of neural summarizers. We measure self-repetition as the number of n-grams of length four or longer that appear in multiple outputs of the same system. We analyze the behavior of three popular architectures (BART, T5 and Pegasus), fine-tuned on five datasets. In a regression analysis, we find that the three architectures have different propensities for repeating content across output summaries for different inputs, with BART being particularly prone to self-repetition. Fine-tuning on more abstractive data, and on data featuring formulaic language, is associated with a higher rate of self-repetition. In qualitative analysis we find systems produce artefacts such as ads and disclaimers unrelated to the content being summarized, as well as formulaic phrases common in the fine-tuning domain. Our approach to corpus level analysis of self-repetition may help practitioners clean up training data for summarizers and ultimately support methods for minimizing the amount of self-repetition.

我们对神经摘要生成器输出中的自我重复进行了定量和定性分析。我们将自我重复衡量为在同一系统的多个输出中出现的长度为四个或更长的 - 词元数量。我们分析了在五个数据集上进行微调的三种流行架构(BART、T5 和 Pegasus)的行为。在回归分析中,我们发现这三种架构在跨输入的输出摘要中重复内容的倾向不同,其中 BART 特别容易出现自我重复。在更抽象的数据以及具有公式化语言的数据上进行微调与更高的自我重复率相关。在定性分析中,我们发现系统会产生与被总结内容无关的人工制品,如广告和免责声明,以及微调领域中常见的公式化短语。我们对自我重复进行语料库级分析的方法可能有助于从业者清理摘要生成器的训练数据,并最终支持将自我重复量降至最低的方法。