Extractive text summarization system to aid data extraction from full text in systematic review development.

Affiliations

Department of Biomedical Informatics, University of Utah, Salt Lake City, UT, USA; Division of Health and Biomedical Informatics, Northwestern University, Chicago, IL, USA.

Department of Biomedical Informatics, University of Utah, Salt Lake City, UT, USA.

Publication information

J Biomed Inform. 2016 Dec;64:265-272. doi: 10.1016/j.jbi.2016.10.014. Epub 2016 Oct 27.


DOI:10.1016/j.jbi.2016.10.014
PMID:27989816
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC5362293/
Abstract

OBJECTIVES: Extracting data from publication reports is a standard process in systematic review (SR) development. However, the data extraction process still relies too much on manual effort which is slow, costly, and subject to human error. In this study, we developed a text summarization system aimed at enhancing productivity and reducing errors in the traditional data extraction process.

METHODS: We developed a computer system that used machine learning and natural language processing approaches to automatically generate summaries of full-text scientific publications. The summaries at the sentence and fragment levels were evaluated in finding common clinical SR data elements such as sample size, group size, and PICO values. We compared the computer-generated summaries with human written summaries (title and abstract) in terms of the presence of necessary information for the data extraction as presented in the Cochrane review's study characteristics tables.

RESULTS: At the sentence level, the computer-generated summaries covered more information than humans do for systematic reviews (recall 91.2% vs. 83.8%, p<0.001). They also had a better density of relevant sentences (precision 59% vs. 39%, p<0.001). At the fragment level, the ensemble approach combining rule-based, concept mapping, and dictionary-based methods performed better than individual methods alone, achieving an 84.7% F-measure.

CONCLUSION: Computer-generated summaries are potential alternative information sources for data extraction in systematic review development. Machine learning and natural language processing are promising approaches to the development of such an extractive summarization system.
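
The abstract reports precision, recall, and F-measure for an ensemble of fragment-level extractors. The sketch below is only an illustration of that idea, not the authors' system: the regular expression, the term list, the example sentence, and all function names are hypothetical, and the concept mapping component is left out because it would need an external terminology resource. It combines a rule-based and a dictionary-based extractor by set union and scores the result against a hand-labeled gold set.

```python
import re

# Hypothetical dictionary of study-design terms (not taken from the paper).
DESIGN_TERMS = ("randomized", "double-blind", "placebo")


def rule_based(sentence):
    """Hypothetical rule: a number followed by a participant noun."""
    pattern = r"\b\d+\s+(?:patients|participants|subjects)\b"
    return {m.group(0) for m in re.finditer(pattern, sentence, flags=re.IGNORECASE)}


def dictionary_based(sentence):
    """Return every dictionary term that occurs in the sentence."""
    lowered = sentence.lower()
    return {term for term in DESIGN_TERMS if term in lowered}


def ensemble(sentence):
    """Combine the two extractors by taking the union of their fragments."""
    return rule_based(sentence) | dictionary_based(sentence)


def precision_recall_f1(predicted, gold):
    """Standard set-based precision, recall, and F-measure (F1)."""
    true_positives = len(predicted & gold)
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(gold) if gold else 0.0
    total = precision + recall
    f1 = 2 * precision * recall / total if total else 0.0
    return precision, recall, f1


# Invented example sentence and gold labels, for illustration only.
sentence = ("A total of 120 patients were enrolled in this randomized, "
            "placebo-controlled trial.")
gold = {"120 patients", "randomized", "placebo"}

rule_scores = precision_recall_f1(rule_based(sentence), gold)
ensemble_scores = precision_recall_f1(ensemble(sentence), gold)
print("rule-based only:", rule_scores)
print("ensemble:       ", ensemble_scores)
```

On this toy sentence the rule-based extractor alone finds one of the three gold fragments, while the union recovers all three. That mirrors the direction of the reported result, but the numbers here say nothing about the study's actual performance.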

Similar articles

[1]
Extractive text summarization system to aid data extraction from full text in systematic review development.

J Biomed Inform. 2016-12

[2]
CERC: an interactive content extraction, recognition, and construction tool for clinical and biomedical text.

BMC Med Inform Decis Mak. 2020-12-15

[3]
Folic acid supplementation and malaria susceptibility and severity among people taking antifolate antimalarial drugs in endemic areas.

Cochrane Database Syst Rev. 2022-2-1

[4]
PDF text classification to leverage information extraction from publication reports.

J Biomed Inform. 2016-6

[5]
Extractive single document summarization using binary differential evolution: Optimization of different sentence quality measures.

PLoS One. 2019-11-14

[6]
Text mining to support abstract screening for knowledge syntheses: a semi-automated workflow.

Syst Rev. 2021-5-26

[7]
A systematic review of automatic text summarization for biomedical literature and EHRs.

J Am Med Inform Assoc. 2021-9-18

[8]
Comparing generative and extractive approaches to information extraction from abstracts describing randomized clinical trials.

J Biomed Semantics. 2024-4-23

[9]
Extractive summarization of clinical trial descriptions.

Int J Med Inform. 2019-5-30

[10]
Extraction of temporal relations from clinical free text: A systematic review of current approaches.

J Biomed Inform. 2020-8

Cited by

[1]
Enhanced transformer for length-controlled abstractive summarization based on summary output area.

PeerJ Comput Sci. 2025-3-11

[2]
Double burden of malnutrition among households in Ethiopia: a systematic review and meta-analysis.

Front Public Health. 2025-1-30

[3]
Efficient evidence selection for systematic reviews in traditional Chinese medicine.

BMC Med Res Methodol. 2025-1-15

[4]
Text summarization for pharmaceutical sciences using hierarchical clustering with a weighted evaluation methodology.

Sci Rep. 2024-8-30

[5]
Leveraging artificial intelligence to summarize abstracts in lay language for increasing research accessibility and transparency.

J Am Med Inform Assoc. 2024-10-1

[6]
Automation of systematic reviews of biomedical literature: a scoping review of studies indexed in PubMed.

Syst Rev. 2024-7-8

[7]
Retrieval augmentation of large language models for lay language generation.

J Biomed Inform. 2024-1

[8]
A Scoping Review of Adopted Information Extraction Methods for RCTs.

Med J Islam Repub Iran. 2023-9-4

[9]
A novel centroid based sentence classification approach for extractive summarization of COVID-19 news reports.

Int J Inf Technol. 2023

[10]
Applications of natural language processing in ophthalmology: present and future.

Front Med (Lausanne). 2022-8-8

References

[1]
Extracting PICO Sentences from Clinical Trial Reports using .

J Mach Learn Res. 2016

[2]
PDF text classification to leverage information extraction from publication reports.

J Biomed Inform. 2016-6

[3]
Automatically finding relevant citations for clinical guideline development.

J Biomed Inform. 2015-10

[4]
Automated methods for the summarization of electronic health records.

J Am Med Inform Assoc. 2015-9

[5]
Link-topic model for biomedical abbreviation disambiguation.

J Biomed Inform. 2015-2

[6]
Support Vector Feature Selection for Early Detection of Anastomosis Leakage From Bag-of-Words in Electronic Health Records.

IEEE J Biomed Health Inform. 2016-9

[7]
Text summarization in the biomedical domain: a systematic review of recent research.

J Biomed Inform. 2014-12

[8]
Learning regular expressions for clinical text classification.

J Am Med Inform Assoc. 2014-2-27

[9]
PICO element detection in medical text without metadata: are first sentences enough?

J Biomed Inform. 2013-7-27

[10]
Classification of diffuse lung disease patterns on high-resolution computed tomography by a bag of words approach.

Med Image Comput Comput Assist Interv. 2011
