Word sense disambiguation in the biomedical domain: an overview.

Authors

Schuemie Martijn J, Kors Jan A, Mons Barend

Affiliation

Biosemantics Group, Medical Informatics Department, Erasmus Medical Center, Dr. Molewaterplein 50, 3015 GE, Rotterdam, The Netherlands.

Publication information

J Comput Biol. 2005 Jun;12(5):554-65. doi: 10.1089/cmb.2005.12.554.

Abstract

There is a trend towards automatic analysis of large amounts of literature in the biomedical domain. However, this can be effective only if the ambiguity in natural language is resolved. In this paper, the current state of research in word sense disambiguation (WSD) is reviewed. Several methods for WSD have already been proposed, but many systems have been tested only on evaluation sets of limited size. There are currently only very few applications of WSD in the biomedical domain. The current direction of research points towards statistically based algorithms that use existing curated data and can be applied to large sets of biomedical literature. There is a need for manually tagged evaluation sets to test WSD algorithms in the biomedical domain. WSD algorithms should preferably be able to take into account both known and unknown senses of a word. Without WSD, automatic meta-analysis of large corpora of text will be error-prone.
