
Dynamic summarization of bibliographic-based data.

Affiliation

Department of Biomedical Informatics, University of Utah, HSEB 5775, Salt Lake City, UT, USA.

Publication information

BMC Med Inform Decis Mak. 2011 Feb 1;11:6. doi: 10.1186/1472-6947-11-6.

DOI: 10.1186/1472-6947-11-6
PMID: 21284871
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC3042900/

Abstract

BACKGROUND

Traditional information retrieval techniques typically return excessive output when directed at large bibliographic databases. Natural Language Processing applications strive to extract salient content from the excessive data. Semantic MEDLINE, a National Library of Medicine (NLM) natural language processing application, highlights relevant information in PubMed data. However, Semantic MEDLINE implements manually coded schemas, accommodating few information needs. Currently, there are only five such schemas, while many more would be needed to realistically accommodate all potential users. The aim of this project was to develop and evaluate a statistical algorithm that automatically identifies relevant bibliographic data; the new algorithm could be incorporated into a dynamic schema to accommodate various information needs in Semantic MEDLINE, and eliminate the need for multiple schemas.

METHODS

We developed a flexible algorithm named Combo that combines three statistical metrics, the Kullback-Leibler Divergence (KLD), Riloff's RlogF metric (RlogF), and a new metric called PredScal, to automatically identify salient data in bibliographic text. We downloaded citations from a PubMed search query addressing the genetic etiology of bladder cancer. The citations were processed with SemRep, an NLM rule-based application that produces semantic predications. SemRep output was then summarized four ways: by Combo, by the standard Semantic MEDLINE genetics schema, and by the KLD and RlogF metrics individually. We evaluated each summarization method using an existing reference standard within the task-based context of genetic database curation.
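
As a rough illustration of two of the metrics named above, the following Python sketch computes a pointwise Kullback-Leibler term and Riloff's RlogF score in their general textbook forms. The abstract does not say how Combo applies or combines these metrics on SemRep predications, and it does not define PredScal, so the predicate names and counts below are hypothetical and the mapping onto predications is an assumption, not the authors' implementation.

```python
import math


def kld_term(p: float, q: float) -> float:
    """One term of the Kullback-Leibler divergence: p * log2(p / q).

    p is the probability of an item in the topic-specific distribution,
    q its probability in the background distribution (q must be > 0).
    """
    if p == 0.0:
        return 0.0
    return p * math.log2(p / q)


def rlogf(relevant_count: int, total_count: int) -> float:
    """Riloff's RlogF: (F / N) * log2(F).

    F is how often an item occurs in relevant text, N how often it
    occurs overall.
    """
    if relevant_count == 0:
        return 0.0
    return (relevant_count / total_count) * math.log2(relevant_count)


# Hypothetical predicate counts (assumption, not data from the paper).
topic_counts = {"ASSOCIATED_WITH": 40, "TREATS": 10}
background_counts = {"ASSOCIATED_WITH": 200, "TREATS": 800}

topic_total = sum(topic_counts.values())
background_total = sum(background_counts.values())

# Score each predicate by its contribution to the divergence between
# the topic-specific and background distributions.
kld_scores = {
    pred: kld_term(topic_counts[pred] / topic_total,
                   background_counts[pred] / background_total)
    for pred in topic_counts
}

for pred, score in kld_scores.items():
    print(pred, round(score, 3))
```

In this toy setup, a predicate that is much more frequent in the topic-specific set than in the background receives a large positive KLD term, which is the intuition behind using the metric to flag salient content.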

RESULTS

Combo asserted 74 genetic entities implicated in bladder cancer development, whereas the traditional schema asserted 10 genetic entities; the KLD and RlogF metrics individually asserted 77 and 69 genetic entities, respectively. Combo achieved 61% recall and 81% precision, with an F-score of 0.69. The traditional schema achieved 23% recall and 100% precision, with an F-score of 0.37. The KLD metric achieved 61% recall and 70% precision, with an F-score of 0.65. The RlogF metric achieved 61% recall and 72% precision, with an F-score of 0.66.
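
The reported F-scores can be sanity-checked from the rounded recall and precision figures. The sketch below assumes the F-score is the balanced F1 (the harmonic mean of precision and recall), which the abstract does not state explicitly; the function and variable names are illustrative.

```python
def f_score(precision: float, recall: float) -> float:
    """Balanced F-score (F1): harmonic mean of precision and recall."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# (recall, precision, reported F-score) as given in the abstract.
reported = {
    "Combo": (0.61, 0.81, 0.69),
    "Traditional schema": (0.23, 1.00, 0.37),
    "KLD": (0.61, 0.70, 0.65),
    "RlogF": (0.61, 0.72, 0.66),
}

for method, (recall, precision, reported_f) in reported.items():
    recomputed = f_score(precision, recall)
    print(f"{method}: recomputed {recomputed:.3f}, reported {reported_f:.2f}")
```

Recomputing from the rounded percentages matches the reported values for the traditional schema, KLD, and RlogF. Combo comes out at roughly 0.696, slightly above the reported 0.69; this small gap is consistent with the published percentages having been rounded, although the abstract does not confirm that.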

CONCLUSIONS

Semantic MEDLINE summarization using the new Combo algorithm outperformed a conventional summarization schema in a genetic database curation task. It could potentially streamline information acquisition for other needs without requiring multiple hand-built saliency schemas.
