Automatic extraction of acronym-meaning pairs from MEDLINE databases.

Authors

Pustejovsky J, Castaño J, Cochran B, Kotecki M, Morrell M

Affiliation

Laboratory for Linguistics and Computation at Brandeis University, Waltham, MA, USA.

Publication

Stud Health Technol Inform. 2001;84(Pt 1):371-5.

Abstract

Acronyms are widely used in biomedical and other technical texts. Understanding their meaning is an important problem in the automatic extraction and mining of information from text. Here we present a system called ACROMED, part of a set of Information Extraction tools designed for processing and extracting information from abstracts in the MEDLINE database. In this paper, we present the results of two strategies for finding the long forms of acronyms in biomedical texts. These strategies differ from previous automated acronym extraction methods in two ways: they are tuned to the complex phrase structures of the biomedical lexicon, and they incorporate shallow parsing of the text into the acronym recognition algorithm. The system was tested on several data sets, achieving 72% recall with 97% precision. For biomedical texts, these results are better than the performance of other acronym extraction systems designed for unrestricted text.
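To make the task concrete, the following is a minimal sketch of acronym and long-form pairing, together with the precision and recall measures quoted above. It is not the ACROMED algorithm: the abstract does not describe the two strategies in detail, so this uses a much simpler baseline that assumes the acronym appears in parentheses directly after its long form and that each letter matches the initial of one preceding word. The function names `find_acronym_pairs` and `precision_recall`, the regular expression, and the sample sentence are all illustrative assumptions.

```python
import re

# Assumption: an acronym is a parenthesized run of 2 to 10 capital letters, e.g. "(PCR)".
ACRONYM_IN_PARENS = re.compile(r"\(([A-Z]{2,10})\)")


def find_acronym_pairs(text):
    """Return (acronym, long_form) pairs found by a naive initial-letter match.

    Hypothetical baseline, not the ACROMED method: the N words before
    "(ACRONYM)" are accepted only if their initials spell the acronym.
    """
    pairs = []
    for match in ACRONYM_IN_PARENS.finditer(text):
        acronym = match.group(1)
        # Collect the words that precede the opening parenthesis.
        words = re.findall(r"[A-Za-z][A-Za-z-]*", text[:match.start()])
        candidate = words[-len(acronym):]
        if len(candidate) < len(acronym):
            continue  # not enough preceding words to form a long form
        initials = "".join(word[0].upper() for word in candidate)
        if initials == acronym:
            pairs.append((acronym, " ".join(candidate)))
    return pairs


def precision_recall(predicted, gold):
    """Compute precision and recall of predicted pairs against a gold set."""
    predicted, gold = set(predicted), set(gold)
    true_positives = len(predicted & gold)
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(gold) if gold else 0.0
    return precision, recall


if __name__ == "__main__":
    # Made-up sentence for illustration only; not data from the paper.
    sentence = ("The polymerase chain reaction (PCR) was combined with "
                "enzyme-linked immunosorbent assay (ELISA).")
    found = find_acronym_pairs(sentence)
    gold = [("PCR", "polymerase chain reaction"),
            ("ELISA", "enzyme-linked immunosorbent assay")]
    print(found, precision_recall(found, gold))
```

The sample sentence shows why such a baseline falls short on biomedical text: "ELISA" has five letters but its long form has only three hyphen-joined words, so a pure initial-letter match misses it. Handling phrase structures like this is the kind of problem the abstract says the ACROMED strategies were tuned for.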
