Effective grading of termhood in biomedical literature.

Authors

Joachim Wermter, Udo Hahn

Affiliation

Jena University Language and Information Engineering (JULIE) Lab. http://www.coling.uni-jena.de

Publication

AMIA Annu Symp Proc. 2005;2005:809-13.

Original article: https://pmc.ncbi.nlm.nih.gov/articles/PMC1560898/

Abstract

The ever-increasing amount of textual information in biomedicine calls for effective procedures for automatic terminology extraction which assist biomedical researchers and professionals in gathering and organizing terminological knowledge encoded in text documents. In this study, we propose a new, linguistically grounded measure for automatically identifying multi-word terms from the biomedical literature. Our approach is based on the limited paradigmatic modifiability of terms and is tested on bigram, trigram and quadgram noun phrases extracted from a 104-million-word text corpus comprised of Medline abstracts. Using the UMLS Metathesaurus as a gold standard, we show that our algorithm substantially outperforms the standard term identification measures and, therefore, qualifies as a high-performing building block for any biomedical terminology mining system.
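The abstract does not spell out the measure itself, so the following is only a toy sketch of the general idea of limited paradigmatic modifiability: a term such as "long terminal repeat" rarely lets one of its word slots be swapped for another word, whereas an ordinary phrase does. The function name `modifiability_score`, the single-slot formulation, and the example counts are all assumptions made for illustration; they are not taken from the paper.

```python
from collections import Counter

def modifiability_score(ngram, counts):
    """Toy score (an assumption, not the paper's formula).

    For each slot of the n-gram, divide the n-gram's frequency by the
    total frequency of all same-length n-grams that agree with it on
    every other slot. Multiply the ratios. A score near 1 means the
    slots are rarely filled by other words (term-like behaviour).
    """
    n = len(ngram)
    score = 1.0
    for slot in range(n):
        # Sum the counts of every n-gram matching outside this slot.
        total = sum(
            c for other, c in counts.items()
            if len(other) == n
            and all(other[i] == ngram[i] for i in range(n) if i != slot)
        )
        score *= counts[ngram] / total
    return score

# Made-up trigram frequencies, for illustration only.
counts = Counter({
    ("long", "terminal", "repeat"): 9,
    ("long", "terminal", "domain"): 1,
    ("this", "large", "study"): 2,
    ("this", "small", "study"): 3,
    ("this", "large", "cohort"): 3,
    ("a", "large", "study"): 2,
})

term_like = modifiability_score(("long", "terminal", "repeat"), counts)
phrase_like = modifiability_score(("this", "large", "study"), counts)
ranked = sorted(counts, key=lambda g: modifiability_score(g, counts), reverse=True)
```

In this sketch the term-like trigram scores higher than the freely modifiable phrase. A ranking built this way could then be compared against a gold standard such as the UMLS Metathesaurus, as the abstract describes. The sketch ignores raw frequency, so a trigram seen only once would score 1.0; a practical system would presumably also need a frequency threshold.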

Similar articles

1

Effective grading of termhood in biomedical literature.

AMIA Annu Symp Proc. 2005;2005:809-13.

2

Terminology-driven mining of biomedical literature.

Bioinformatics. 2003 May 22;19(8):938-43. doi: 10.1093/bioinformatics/btg105.

3

Recognizing names in biomedical texts: a machine learning approach.

Bioinformatics. 2004 May 1;20(7):1178-90. doi: 10.1093/bioinformatics/bth060. Epub 2004 Feb 10.

4

A shallow parser based on closed-class words to capture relations in biomedical text.

J Biomed Inform. 2003 Jun;36(3):145-58. doi: 10.1016/s1532-0464(03)00039-x.

5

Effective mapping of biomedical text to the UMLS Metathesaurus: the MetaMap program.

Proc AMIA Symp. 2001:17-21.

6

Mining molecular binding terminology from biomedical text.

Proc AMIA Symp. 1999:127-31.

7

Identification of key concepts in biomedical literature using a modified Markov heuristic.

Bioinformatics. 2003 Feb 12;19(3):402-7. doi: 10.1093/bioinformatics/btg010.

8

Corpus-based statistical screening for phrase identification.

J Am Med Inform Assoc. 2000 Sep-Oct;7(5):499-511. doi: 10.1136/jamia.2000.0070499.

9

Semantic reclassification of the UMLS concepts.

Bioinformatics. 2008 Sep 1;24(17):1971-3. doi: 10.1093/bioinformatics/btn343. Epub 2008 Jul 13.

10

The interaction of domain knowledge and linguistic structure in natural language processing: interpreting hypernymic propositions in biomedical text.

J Biomed Inform. 2003 Dec;36(6):462-77. doi: 10.1016/j.jbi.2003.11.003.

Cited by

1

FlexiTerm: a flexible term recognition method.

J Biomed Semantics. 2013 Oct 10;4(1):27. doi: 10.1186/2041-1480-4-27.

2

Term identification methods for consumer health vocabulary development.

J Med Internet Res. 2007 Feb 28;9(1):e4. doi: 10.2196/jmir.9.1.e4.
