Exploiting the performance of dictionary-based bio-entity name recognition in biomedical literature.

Authors

Yang Zhihao, Lin Hongfei, Li Yanpeng

Affiliation

Department of Computer Science and Engineering, Dalian University of Technology, 116023 Dalian, China.

Publication

Comput Biol Chem. 2008 Aug;32(4):287-91. doi: 10.1016/j.compbiolchem.2008.03.008. Epub 2008 Apr 1.

Abstract

Bio-entity name recognition is a key step in information extraction from biomedical literature. This paper presents a dictionary-based bio-entity name recognition approach. The approach expands the bio-entity name dictionary with an abbreviation definition identification algorithm, improves recall with an improved edit distance algorithm, and applies several post-processing methods to further improve performance: pre-keyword and post-keyword expansion, part-of-speech expansion, merging of adjacent bio-entity names, and exploitation of contextual cues. Experimental results show that with this approach even an internal dictionary-based system can achieve fairly good performance.
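
The abstract does not say how the edit distance algorithm was improved, so the sketch below illustrates only the general idea of approximate dictionary matching. It uses the classic Levenshtein distance as a stand-in. The function names, the toy dictionary, and the 0.2 length-relative threshold are all assumptions made for illustration and are not taken from the paper.

```python
from typing import Iterable, Optional, Tuple


def edit_distance(a: str, b: str) -> int:
    """Classic Levenshtein distance: insert, delete, substitute each cost 1.

    This is the textbook algorithm, NOT the paper's improved variant.
    """
    prev = list(range(len(b) + 1))  # distances from "" to each prefix of b
    for i, ca in enumerate(a, start=1):
        curr = [i]  # distance from a[:i] to ""
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # delete ca
                            curr[j - 1] + 1,      # insert cb
                            prev[j - 1] + cost))  # substitute or match
        prev = curr
    return prev[-1]


def fuzzy_lookup(candidate: str, dictionary: Iterable[str],
                 max_ratio: float = 0.2) -> Optional[Tuple[str, int]]:
    """Return (entry, distance) for the closest entry within the threshold.

    max_ratio is a hypothetical tolerance: the allowed distance as a
    fraction of the longer string's length. Returns None if nothing is
    close enough.
    """
    best: Optional[Tuple[str, int]] = None
    cand = candidate.lower()
    for entry in dictionary:
        d = edit_distance(cand, entry.lower())
        limit = max_ratio * max(len(cand), len(entry))
        if d <= limit and (best is None or d < best[1]):
            best = (entry, d)
    return best


# Hypothetical toy dictionary, for illustration only.
toy_dictionary = ["interleukin-2", "NF-kappa B", "p53"]

# A spelling variant (space instead of hyphen) still matches the entry.
print(fuzzy_lookup("interleukin 2", toy_dictionary))
```

Tolerating small spelling differences in this way lets a dictionary-based recognizer match name variants that exact lookup would miss, which is how approximate matching can raise recall. A looser threshold also admits more false matches, so some filtering of the results is usually needed.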
