Different approaches for identifying important concepts in probabilistic biomedical text summarization.

Affiliation

Department of Electrical and Computer Engineering, Isfahan University of Technology, Isfahan 84156-83111, Iran.

Publication information

Artif Intell Med. 2018 Jan;84:101-116. doi: 10.1016/j.artmed.2017.11.004. Epub 2017 Dec 6.

DOI:10.1016/j.artmed.2017.11.004

PMID:29208328

Abstract

Automatic text summarization tools help users in the biomedical domain acquire the information they need from various textual resources more efficiently. Some biomedical text summarization systems base their sentence selection on the frequency of concepts extracted from the input text. However, measures other than raw frequency for identifying valuable content within an input document, or measures that consider correlations between concepts, may be more useful for this type of summarization. In this paper, we describe a Bayesian summarization method for biomedical text documents. The Bayesian summarizer first maps the input text to Unified Medical Language System (UMLS) concepts, then selects the important ones to use as classification features. We introduce six feature selection approaches to identify the most important concepts of the text and select the most informative content according to the distribution of these concepts. We show that, with an appropriate feature selection approach, the Bayesian summarizer can improve the performance of biomedical summarization. Using the Recall-Oriented Understudy for Gisting Evaluation (ROUGE) toolkit, we perform extensive evaluations on a corpus of scientific papers in the biomedical domain. The results show that when the Bayesian summarizer uses feature selection methods that do not rely on raw frequency, it can outperform biomedical summarizers that rely on concept frequency, as well as domain-independent and baseline methods.

Similar articles

Quantifying the informativeness for biomedical literature summarization: An itemset mining method.

Comput Methods Programs Biomed. 2017 Jul;146:77-89. doi: 10.1016/j.cmpb.2017.05.011. Epub 2017 May 27.

Graph-based biomedical text summarization: An itemset mining and sentence clustering approach.

J Biomed Inform. 2018 Aug;84:42-58. doi: 10.1016/j.jbi.2018.06.005. Epub 2018 Jun 15.

CIBS: A biomedical text summarizer using topic-based sentence clustering.

J Biomed Inform. 2018 Dec;88:53-61. doi: 10.1016/j.jbi.2018.11.006. Epub 2018 Nov 13.

Summarization of biomedical articles using domain-specific word embeddings and graph ranking.

J Biomed Inform. 2020 Jul;107:103452. doi: 10.1016/j.jbi.2020.103452. Epub 2020 May 19.

Deep contextualized embeddings for quantifying the informative content in biomedical text summarization.

Comput Methods Programs Biomed. 2020 Feb;184:105117. doi: 10.1016/j.cmpb.2019.105117. Epub 2019 Oct 4.

MultiGBS: A multi-layer graph approach to biomedical summarization.

J Biomed Inform. 2021 Apr;116:103706. doi: 10.1016/j.jbi.2021.103706. Epub 2021 Feb 18.

Biomedical semantic text summarizer.

BMC Bioinformatics. 2024 Apr 16;25(1):152. doi: 10.1186/s12859-024-05712-x.

Comparing different knowledge sources for the automatic summarization of biomedical literature.

J Biomed Inform. 2014 Dec;52:319-28. doi: 10.1016/j.jbi.2014.07.014. Epub 2014 Jul 24.

A semantic graph-based approach to biomedical summarisation.

Artif Intell Med. 2011 Sep;53(1):1-14. doi: 10.1016/j.artmed.2011.06.005. Epub 2011 Jul 12.

Cited by

Biomedical semantic text summarizer.

BMC Bioinformatics. 2024 Apr 16;25(1):152. doi: 10.1186/s12859-024-05712-x.

Retrieval augmentation of large language models for lay language generation.

J Biomed Inform. 2024 Jan;149:104580. doi: 10.1016/j.jbi.2023.104580. Epub 2023 Dec 30.

Patient Information Summarization in Clinical Settings: Scoping Review.

JMIR Med Inform. 2023 Nov 28;11:e44639. doi: 10.2196/44639.

A systematic review of automatic text summarization for biomedical literature and EHRs.

J Am Med Inform Assoc. 2021 Sep 18;28(10):2287-2297. doi: 10.1093/jamia/ocab143.

Identify the Characteristics of Metabolic Syndrome and Non-obese Phenotype: Data Visualization and a Machine Learning Approach.

Front Med (Lausanne). 2021 Apr 7;8:626580. doi: 10.3389/fmed.2021.626580. eCollection 2021.

Predicting Metabolic Syndrome With Machine Learning Models Using a Decision Tree Algorithm: Retrospective Cohort Study.

JMIR Med Inform. 2020 Mar 23;8(3):e17110. doi: 10.2196/17110.

Exploiting Machine Learning Algorithms and Methods for the Prediction of Agitated Delirium After Cardiac Surgery: Models Development and Validation Study.

JMIR Med Inform. 2019 Oct 23;7(4):e14993. doi: 10.2196/14993.
