Towards a semantic lexicon for biological language processing.

Author information

Verspoor Karin

Affiliation

Los Alamos National Laboratory, Los Alamos, NM 87545, USA.

Publication information

Comp Funct Genomics. 2005;6(1-2):61-6. doi: 10.1002/cfg.451.

Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC2448606/

Abstract

This paper explores the use of the resources in the National Library of Medicine's Unified Medical Language System (UMLS) for the construction of a lexicon useful for processing texts in the field of molecular biology. A lexicon is constructed from overlapping terms in the UMLS SPECIALIST lexicon and the UMLS Metathesaurus to obtain both morphosyntactic and semantic information for terms, and the coverage of a domain corpus is assessed. Over 77% of tokens in the domain corpus are found in the constructed lexicon, validating the lexicon's coverage of the most frequent terms in the domain and indicating that the constructed lexicon is potentially an important resource for biological text processing.
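
The two steps the abstract describes, intersecting the terms of the two resources and then measuring token coverage over a corpus, can be sketched as follows. This is a minimal illustration under stated assumptions, not the paper's implementation: the function names `build_lexicon` and `token_coverage`, the dictionary layout, and the toy entries are all hypothetical, and the real SPECIALIST lexicon and Metathesaurus distribution files are not parsed here.

```python
from collections import Counter


def build_lexicon(specialist, metathesaurus):
    """Keep only terms found in both resources.

    specialist:    term -> part of speech (stand-in for morphosyntactic info)
    metathesaurus: term -> set of semantic types (stand-in for semantic info)
    Both layouts are assumptions made for this sketch.
    """
    shared_terms = specialist.keys() & metathesaurus.keys()
    return {
        term: {"pos": specialist[term], "semantic_types": metathesaurus[term]}
        for term in shared_terms
    }


def token_coverage(tokens, lexicon):
    """Fraction of corpus tokens (not distinct types) found in the lexicon."""
    counts = Counter(token.lower() for token in tokens)
    total = sum(counts.values())
    if total == 0:
        return 0.0
    covered = sum(n for token, n in counts.items() if token in lexicon)
    return covered / total


# Toy data, invented for illustration only.
specialist = {"protein": "noun", "bind": "verb", "kinase": "noun", "the": "det"}
metathesaurus = {
    "protein": {"Amino Acid, Peptide, or Protein"},
    "kinase": {"Enzyme"},
    "receptor": {"Receptor"},
}

lexicon = build_lexicon(specialist, metathesaurus)
tokens = "The kinase binds the protein and the receptor".split()
coverage = token_coverage(tokens, lexicon)
print(sorted(lexicon), round(coverage, 2))
```

Because coverage is counted per token, a frequent term that is present in the lexicon raises the figure more than a rare one, which is why a token-level result speaks mainly to coverage of the most frequent terms. The sketch handles single-word tokens only; multi-word terms would need a separate matching step.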

Similar articles

1

Towards a semantic lexicon for biological language processing.

Comp Funct Genomics. 2005;6(1-2):61-6. doi: 10.1002/cfg.451.

2

A semantic lexicon for medical language processing.

J Am Med Inform Assoc. 1999 May-Jun;6(3):205-18. doi: 10.1136/jamia.1999.0060205.

3

Towards a semantic lexicon for clinical natural language processing.

AMIA Annu Symp Proc. 2012;2012:568-76. Epub 2012 Nov 3.

4

MedLexSp - a medical lexicon for Spanish medical natural language processing.

J Biomed Semantics. 2023 Feb 2;14(1):2. doi: 10.1186/s13326-022-00281-5.

5

Corpus-based Approach to Creating a Semantic Lexicon for Clinical Research Eligibility Criteria from UMLS.

Summit Transl Bioinform. 2010 Mar 1;2010:26-30.

6

Siamese KG-LSTM: A deep learning model for enriching UMLS Metathesaurus synonymy.

Int Conf Knowl Syst Eng. 2020 Nov;2020:281-286. doi: 10.1109/kse50997.2020.9287797. Epub 2020 Dec 16.

7

UMLS knowledge for biomedical language processing.

Bull Med Libr Assoc. 1993 Apr;81(2):184-94.

8

A technique for semantic classification of unknown words using UMLS resources.

Proc AMIA Symp. 1999:716-20.

9

Evaluating the UMLS as a source of lexical knowledge for medical language processing.

Proc AMIA Symp. 2001:189-93.

10

Improved identification of noun phrases in clinical radiology reports using a high-performance statistical natural language parser augmented with the UMLS specialist lexicon.

J Am Med Inform Assoc. 2005 May-Jun;12(3):275-85. doi: 10.1197/jamia.M1695. Epub 2005 Jan 31.

Cited by

1

Identifying named entities from PubMed for enriching semantic categories.

BMC Bioinformatics. 2015 Feb 21;16:57. doi: 10.1186/s12859-015-0487-2.

2

Extracting semantic lexicons from discharge summaries using machine learning and the C-Value method.

AMIA Annu Symp Proc. 2012;2012:409-16. Epub 2012 Nov 3.

3

The BioLexicon: a large-scale terminological resource for biomedical text mining.

BMC Bioinformatics. 2011 Oct 12;12:397. doi: 10.1186/1471-2105-12-397.

4

Deriving a probabilistic syntacto-semantic grammar for biomedicine based on domain-specific terminologies.

J Biomed Inform. 2011 Oct;44(5):805-14. doi: 10.1016/j.jbi.2011.04.006. Epub 2011 Apr 28.

5

Corpus-based Approach to Creating a Semantic Lexicon for Clinical Research Eligibility Criteria from UMLS.

Summit Transl Bioinform. 2010 Mar 1;2010:26-30.

6

UMLS content views appropriate for NLP processing of the biomedical literature vs. clinical text.

J Biomed Inform. 2010 Aug;43(4):587-94. doi: 10.1016/j.jbi.2010.02.005. Epub 2010 Feb 10.

7

Ontology quality assurance through analysis of term transformations.

Bioinformatics. 2009 Jun 15;25(12):i77-84. doi: 10.1093/bioinformatics/btp195.

References

1

Evaluating UMLS strings for natural language processing.

Proc AMIA Symp. 2001:448-52.

2

Evaluating the UMLS as a source of lexical knowledge for medical language processing.

Proc AMIA Symp. 2001:189-93.

3

Gene ontology: tool for the unification of biology. The Gene Ontology Consortium.

Nat Genet. 2000 May;25(1):25-9. doi: 10.1038/75556.

4

A semantic lexicon for medical language processing.

J Am Med Inform Assoc. 1999 May-Jun;6(3):205-18. doi: 10.1136/jamia.1999.0060205.

5

How knowledge drives understanding--matching medical ontologies with the needs of medical language processing.

Artif Intell Med. 1999 Jan;15(1):25-51. doi: 10.1016/s0933-3657(98)00044-x.
