Evolutionary Algorithm based Ensemble Extractive Summarization for Developing Smart Medical System.

Affiliations

Department of Computer Science and Technology, Indian Institute of Engineering Science and Technology, Shibpur, Howrah, 711103, India.

Department of Computer Science and Engineering, Aditya Institute of Technology and Management (AITAM), Tekkali, Andhra Pradesh, 532201, India.

Publication information

Interdiscip Sci. 2021 Jun;13(2):229-259. doi: 10.1007/s12539-020-00412-5. Epub 2021 Feb 12.

Abstract

The scientific literature of the biomedical domain is growing exponentially, which makes developing a smart medical system difficult. Summarization techniques help users search for and understand relevant information in medical documents efficiently. In this paper, an evolutionary-algorithm-based ensemble extractive summarization technique is devised as a smart medical application, built on the idea of hybrid artificial intelligence for natural language processing. We take the abstracts of the target article and of its cited articles as the base summaries, and apply a multi-objective evolutionary algorithm to generate the ensemble summary of the target article. Each sentence of the base summaries is represented by a concept vector of the medical terms it contains, obtained with the Unified Medical Language System (UMLS) tool, which is widely used in smart medical applications. These terms carry the key information of the sentence and are therefore useful for measuring the semantic similarity among sentences. The fitness functions of the evolutionary algorithm are defined mainly in terms of two graph-theoretic concepts, the clustering coefficient and the sparsity index. After the algorithm converges, the best solution in the final population gives the ensemble summary. Next, the semantic similarity of each sentence in the target article to the ensemble summary is calculated, and the sentences most similar to the ensemble summary are taken as the summary of the target article. The method is applied to articles available in the PubMed MEDLINE database, and the experimental results are compared with state-of-the-art methods applied in the biomedical domain. The experimental results and a comparative study based on performance evaluation show that the method is competitive with some recently proposed summarization methods and outperforms others, which demonstrates its effectiveness. Several statistical tests were also carried out to confirm that the results are statistically significant.
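
The abstract names its building blocks but gives no formulas, so the following is only a minimal sketch of how those pieces might fit together, not the authors' implementation. It assumes cosine similarity between sparse concept vectors, an arbitrary similarity threshold of 0.3 for drawing graph edges, a sparsity index defined as one minus edge density, and ranking of target sentences against the centroid of the ensemble summary. All function names are hypothetical, the concept IDs (`C1`, `C2`, ...) are placeholders rather than real UMLS identifiers, and the multi-objective evolutionary search itself is omitted.

```python
from itertools import combinations
from math import sqrt


def cosine(u, v):
    """Cosine similarity of two sparse concept vectors (dict: concept -> weight)."""
    dot = sum(w * v.get(c, 0.0) for c, w in u.items())
    norm_u = sqrt(sum(w * w for w in u.values()))
    norm_v = sqrt(sum(w * w for w in v.values()))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def similarity_graph(vectors, threshold=0.3):
    """Link sentences whose similarity reaches the (assumed) threshold."""
    adj = {i: set() for i in range(len(vectors))}
    for i, j in combinations(range(len(vectors)), 2):
        if cosine(vectors[i], vectors[j]) >= threshold:
            adj[i].add(j)
            adj[j].add(i)
    return adj


def avg_clustering(adj):
    """Average local clustering coefficient; nodes of degree < 2 count as 0."""
    total = 0.0
    for nbrs in adj.values():
        k = len(nbrs)
        if k < 2:
            continue
        links = sum(1 for a, b in combinations(nbrs, 2) if b in adj[a])
        total += 2.0 * links / (k * (k - 1))
    return total / len(adj) if adj else 0.0


def sparsity(adj):
    """Assumed definition: 1 - edge density of the sentence graph."""
    n = len(adj)
    if n < 2:
        return 0.0
    edges = sum(len(nbrs) for nbrs in adj.values()) / 2
    return 1.0 - edges / (n * (n - 1) / 2)


def rank_by_summary(target_vectors, summary_vectors, k):
    """Indices (in document order) of the k target sentences closest to the summary."""
    centroid = {}
    for vec in summary_vectors:
        for concept, weight in vec.items():
            centroid[concept] = centroid.get(concept, 0.0) + weight
    scored = [(cosine(vec, centroid), i) for i, vec in enumerate(target_vectors)]
    top = sorted(scored, key=lambda s: (-s[0], s[1]))[:k]
    return sorted(i for _, i in top)


# Toy base-summary sentences as placeholder concept vectors.
base = [
    {"C1": 1.0, "C2": 1.0},
    {"C1": 1.0, "C2": 1.0, "C3": 1.0},
    {"C2": 1.0, "C3": 1.0},
    {"C9": 1.0},
]
graph = similarity_graph(base)
objectives = (avg_clustering(graph), sparsity(graph))

# Pretend the search selected the first two sentences as the ensemble summary.
selected = rank_by_summary([base[3], base[0], base[2]], base[:2], k=2)
```

In a full pipeline, a multi-objective optimizer would evaluate quantities like `objectives` for each candidate subset of base-summary sentences, and `rank_by_summary` would correspond to the final extraction step over the target article.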
