SATS: simplification aware text summarization of scientific documents.

Authors

Zaman Farooq, Kamiran Faisal, Shardlow Matthew, Hassan Saeed-Ul, Karim Asim, Aljohani Naif Radi

Affiliations

Scientometrics Lab, Information Technology University, Lahore, Pakistan.

Department of Computing and Mathematics, Manchester Metropolitan University, Manchester, United Kingdom.

Publication information

Front Artif Intell. 2024 Jul 10;7:1375419. doi: 10.3389/frai.2024.1375419. eCollection 2024.

DOI: 10.3389/frai.2024.1375419
PMID: 39049961
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC11266102/

Abstract

Simplifying summaries of scholarly publications has been a popular method for conveying scientific discoveries to a broader audience. While text summarization aims to shorten long documents, simplification seeks to reduce the complexity of a document. To accomplish both tasks together, machine learning methods are needed that shorten and simplify longer texts. This study presents a new Simplification Aware Text Summarization model (SATS) based on future n-gram prediction. The proposed SATS model extends ProphetNet, a text summarization model, by enhancing the objective function with a word frequency lexicon for simplification tasks. We evaluated the performance of SATS on a recently published text summarization and simplification corpus consisting of 5,400 scientific article pairs. Our automatic evaluation results demonstrate that SATS outperforms state-of-the-art models for simplification, summarization, and joint simplification-summarization across two datasets on automatic metrics including ROUGE and SARI. We also provide a human evaluation of summaries generated by the SATS model. Eight annotators rated 100 summaries for grammar, coherence, consistency, fluency, and simplicity. The average human judgment for all evaluated dimensions lies between 4.0 and 4.5 on a scale from 1 to 5, where 1 means low and 5 means high.
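
The abstract does not spell out how the word frequency lexicon enters the objective function, so the following is only a minimal sketch of one plausible formulation, not the authors' loss. It assumes a ProphetNet-style setup in which stream j predicts the token j steps ahead, and it adds a hypothetical penalty equal to the expected rarity of the predicted words. The names `LEXICON`, `complexity`, `simplicity_aware_nll`, `future_ngram_loss`, and the parameters `lam`, `alphas`, and `floor` are all illustrative assumptions.

```python
import math

# Hypothetical word-frequency lexicon (relative frequencies, illustrative values only).
LEXICON = {"the": 0.05, "use": 0.001, "utilize": 0.00001}


def complexity(word, lexicon, floor=1e-7):
    """Rarity score: higher for words that are rare or missing in the lexicon."""
    freq = max(lexicon.get(word, 0.0), floor)
    return -math.log10(freq)


def simplicity_aware_nll(probs, target, lexicon, lam=0.1):
    """Negative log-likelihood of the target plus an assumed complexity penalty.

    probs maps each vocabulary word to its predicted probability.
    The penalty is the expected rarity of the predicted word, scaled by lam.
    """
    nll = -math.log(probs[target])
    expected_complexity = sum(p * complexity(w, lexicon) for w, p in probs.items())
    return nll + lam * expected_complexity


def future_ngram_loss(stream_probs, targets, lexicon, alphas, lam=0.1):
    """Weighted sum of per-stream losses for future n-gram prediction.

    stream_probs[j][t] is the distribution predicted at step t for token t + j,
    and alphas[j] is the weight given to stream j.
    """
    total = 0.0
    for j, alpha in enumerate(alphas):
        losses = [
            simplicity_aware_nll(probs, targets[t + j], lexicon, lam)
            for t, probs in enumerate(stream_probs[j])
            if t + j < len(targets)
        ]
        if losses:
            total += alpha * sum(losses) / len(losses)
    return total


# Two toy distributions over a two-word vocabulary.
prefers_simple = {"use": 0.9, "utilize": 0.1}
prefers_complex = {"use": 0.1, "utilize": 0.9}

loss_simple = simplicity_aware_nll(prefers_simple, "use", LEXICON)
loss_complex = simplicity_aware_nll(prefers_complex, "utilize", LEXICON)
```

In this toy setup both distributions assign the same probability to their target, so the only difference between `loss_simple` and `loss_complex` is the penalty term, which is smaller when probability mass sits on the more frequent word. Setting `lam=0` recovers a plain negative log-likelihood.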


Figures

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/5ab5/11266102/c5b715e29f0a/frai-07-1375419-g0001.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/5ab5/11266102/180072f98480/frai-07-1375419-g0002.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/5ab5/11266102/e6e1e62fa418/frai-07-1375419-g0003.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/5ab5/11266102/025faaef0685/frai-07-1375419-g0004.jpg
