

Effective grading of termhood in biomedical literature.

Authors

Wermter Joachim, Hahn Udo

Affiliation

Jena University Language and Information Engineering (JULIE) Lab. http://www.coling.uni-jena.de

Publication

AMIA Annu Symp Proc. 2005;2005:809-13.

Abstract

The ever-increasing amount of textual information in biomedicine calls for effective procedures for automatic terminology extraction which assist biomedical researchers and professionals in gathering and organizing terminological knowledge encoded in text documents. In this study, we propose a new, linguistically grounded measure for automatically identifying multi-word terms from the biomedical literature. Our approach is based on the limited paradigmatic modifiability of terms and is tested on bigram, trigram and quadgram noun phrases extracted from a 104-million-word text corpus comprised of Medline abstracts. Using the UMLS Metathesaurus as a gold standard, we show that our algorithm substantially outperforms the standard term identification measures and, therefore, qualifies as a high-performing building block for any biomedical terminology mining system.
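The abstract names the principle (terms allow only limited paradigmatic modifiability) but does not give the formula. As a rough illustration only, the sketch below assumes one way to operationalize it: for each subset of token slots in a candidate n-gram, compare the candidate's frequency with the total frequency of all n-grams that differ from it only in those slots, and multiply the resulting ratios. The function name `modifiability_score`, the scoring scheme, and the toy counts are assumptions made for this example; they are not taken from the paper or from Medline.

```python
from collections import Counter
from itertools import combinations


def modifiability_score(ngram, counts):
    """Return a score in (0, 1]; higher means fewer observed slot substitutions.

    Assumed scheme (not confirmed by the abstract): for every non-empty
    subset of slots, divide the candidate's frequency by the frequency of
    all same-length n-grams that match it outside those slots, then
    multiply the ratios together.
    """
    if counts[ngram] == 0:
        raise ValueError("candidate n-gram must occur in the counts")
    n = len(ngram)
    score = 1.0
    for k in range(1, n + 1):
        for slots in combinations(range(n), k):
            fixed = [i for i in range(n) if i not in slots]
            # Total frequency of n-grams agreeing with the candidate
            # in every position that is not treated as a free slot.
            pattern_freq = sum(
                c for other, c in counts.items()
                if len(other) == n and all(other[i] == ngram[i] for i in fixed)
            )
            score *= counts[ngram] / pattern_freq
    return score


# Toy bigram counts, invented purely for illustration.
toy_counts = Counter({
    ("stem", "cell"): 6,
    ("new", "cell"): 2,
    ("stem", "line"): 2,
})

stem_cell = modifiability_score(("stem", "cell"), toy_counts)
new_cell = modifiability_score(("new", "cell"), toy_counts)
print(stem_cell > new_cell)
```

Under this assumed scheme, a candidate whose slots are rarely filled by other tokens scores closer to 1, which is the intuition behind ranking it higher as a term candidate. Ranked candidates could then be checked against a gold standard such as the UMLS Metathesaurus, as the abstract describes.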



