ALICE: an algorithm to extract abbreviations from MEDLINE.

Author information

Ao Hiroko, Takagi Toshihisa

Affiliation

Department of Computational Biology, University of Tokyo CB01, 5-1-5, Kashiwanoha, Kashiwa-shi, Chiba, 277-8561, Japan.

Publication information

J Am Med Inform Assoc. 2005 Sep-Oct;12(5):576-86. doi: 10.1197/jamia.M1757. Epub 2005 May 19.

Abstract

OBJECTIVE

To help biomedical researchers recognize dynamically introduced abbreviations in biomedical literature, such as gene and protein names, we have constructed a support system called ALICE (Abbreviation LIfter using Corpus-based Extraction). ALICE aims to extract all types of abbreviations with their expansions from a target paper on the fly.

METHODS

ALICE extracts an abbreviation and its expansion from the literature by using heuristic pattern-matching rules. The system consists of three phases and, by combining these rules, can potentially identify 320 valid abbreviation-expansion patterns.
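To make the idea of a heuristic pattern-matching rule concrete, here is a minimal Python sketch of one generic rule: a parenthesized short form whose letters match the initials of the words immediately before it. This is an illustration under stated assumptions, not ALICE's actual rule set, which the abstract does not describe in detail. The names `PAREN` and `extract_pairs` are hypothetical.

```python
import re

# Hypothetical illustration only: one generic "long form (SHORT)" heuristic,
# not a reproduction of ALICE's actual rules or phases.
PAREN = re.compile(r"\(([^()]{2,10})\)")

def extract_pairs(text):
    """Return (abbreviation, expansion) pairs found by first-letter matching."""
    pairs = []
    for match in PAREN.finditer(text):
        short = match.group(1).strip()
        letters = [c.lower() for c in short if c.isalnum()]
        # Require a single token that contains at least one uppercase letter,
        # which filters out asides such as "(P < 0.05)" or "(n=3)".
        if " " in short or not letters or not any(c.isupper() for c in short):
            continue
        # Take as many preceding words as the short form has characters.
        words = re.findall(r"[A-Za-z0-9]+", text[:match.start()])
        candidate = words[-len(letters):]
        # Accept only if each preceding word starts with the matching letter.
        if len(candidate) == len(letters) and all(
            w[0].lower() == c for w, c in zip(candidate, letters)
        ):
            pairs.append((short, " ".join(candidate)))
    return pairs

print(extract_pairs("Levels of tumor necrosis factor (TNF) were measured."))
# prints [('TNF', 'tumor necrosis factor')]
```

A single rule like this misses many real cases (letters taken from inside a word, skipped stop words, expansions that follow the abbreviation), which is presumably why a system of this kind combines many rules into a larger set of patterns.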

RESULTS

ALICE achieved 95% recall and 97% precision on randomly selected titles and abstracts from the MEDLINE database.
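For reference, precision is the fraction of extracted pairs that are correct, and recall is the fraction of gold-standard pairs that were extracted. The sketch below shows the standard computation in Python; the function name `precision_recall` and the toy pairs are hypothetical and are not data from the study.

```python
def precision_recall(extracted, gold):
    """Compute precision and recall of extracted pairs against a gold set."""
    extracted, gold = set(extracted), set(gold)
    true_positives = len(extracted & gold)
    # Precision: correct extractions divided by all extractions.
    precision = true_positives / len(extracted) if extracted else 0.0
    # Recall: correct extractions divided by all gold-standard pairs.
    recall = true_positives / len(gold) if gold else 0.0
    return precision, recall

# Toy, made-up example (not taken from the paper's evaluation).
gold = {("TNF", "tumor necrosis factor"), ("HR", "heart rate"),
        ("BP", "blood pressure"), ("MRI", "magnetic resonance imaging")}
extracted = {("TNF", "tumor necrosis factor"), ("HR", "heart rate"),
             ("BP", "blood plasma")}
precision, recall = precision_recall(extracted, gold)
```

In this toy case two of three extracted pairs are correct and two of four gold pairs are recovered.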

CONCLUSION

ALICE extracted abbreviations and their expansions from the literature efficiently. Its carefully compiled heuristics enabled it to extract abbreviations with high recall without significantly reducing precision. ALICE not only facilitates recognition of an undefined abbreviation in a paper by constructing an abbreviation database or dictionary, but also makes biomedical literature retrieval more accurate. This system is freely available at http://uvdb3.hgc.jp/ALICE/ALICE_index.html.

Similar articles

Resolving abbreviations to their senses in Medline.
Bioinformatics. 2005 Sep 15;21(18):3658-64. doi: 10.1093/bioinformatics/bti586. Epub 2005 Jul 21.

ADAM: another database of abbreviations in MEDLINE.
Bioinformatics. 2006 Nov 15;22(22):2813-8. doi: 10.1093/bioinformatics/btl480. Epub 2006 Sep 18.

Building an abbreviation dictionary using a term recognition approach.
Bioinformatics. 2006 Dec 15;22(24):3089-95. doi: 10.1093/bioinformatics/btl534. Epub 2006 Oct 18.

Cited by

Improved biomedical word embeddings in the transformer era.
J Biomed Inform. 2021 Aug;120:103867. doi: 10.1016/j.jbi.2021.103867. Epub 2021 Jul 18.

TPX: Biomedical literature search made easy.
Bioinformation. 2012;8(12):578-80. doi: 10.6026/97320630008578. Epub 2012 Jun 28.

Allie: a database and a search service of abbreviations and long forms.
Database (Oxford). 2011 Apr 15;2011:bar013. doi: 10.1093/database/bar013. Print 2011.
