Wide-scope biomedical named entity recognition and normalization with CRFs, fuzzy matching and character level modeling.

Affiliations

Turku Centre for Computer Science, Turku, Finland.

Department of Future Technologies, University of Turku, Turku, Finland.

Publication information

Database (Oxford). 2018 Jan 1;2018:1-10. doi: 10.1093/database/bay096.

Abstract

We present a system for automatically identifying a wide range of biomedical entities in the literature. This work builds on our earlier participation in the BioCreative VI Interactive Bio-ID Assignment shared task, in which our system demonstrated state-of-the-art performance and achieved the highest results in named entity recognition. In this paper we describe the original conditional random field-based system used in the shared task, as well as experiments conducted since then, including better hyperparameter tuning and character-level modeling, which led to further performance improvements. To normalize mentions into unique identifiers we use fuzzy character n-gram matching. The normalization approach has also been improved with a better abbreviation resolution method and stricter guideline compliance, resulting in vastly improved results for various entity types. All tools and models used for both named entity recognition and normalization are publicly available under an open license.

Database URL: https://github.com/TurkuNLP/BioCreativeVI_BioID_assignment.
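The abstract names fuzzy character n-gram matching as the normalization technique but does not spell out the details. The sketch below shows one common way such matching can work, under assumptions that are not taken from the paper: character trigrams, plain count vectors, cosine similarity, and a fixed acceptance threshold. The function names, the threshold value, and the toy lexicon with its identifiers are all hypothetical and are not the authors' implementation.

```python
from collections import Counter
from math import sqrt


def char_ngrams(text, n=3):
    """Return a multiset of character n-grams for a lowercased, space-padded string."""
    padded = f" {text.lower().strip()} "
    return Counter(padded[i:i + n] for i in range(max(len(padded) - n + 1, 1)))


def cosine(a, b):
    """Cosine similarity between two n-gram count vectors."""
    dot = sum(count * b[gram] for gram, count in a.items())
    norm = sqrt(sum(c * c for c in a.values())) * sqrt(sum(c * c for c in b.values()))
    return dot / norm if norm else 0.0


def normalize(mention, lexicon, threshold=0.5):
    """Return the identifier of the most similar lexicon name, or None if no name is similar enough.

    The threshold is an assumed value chosen for illustration only.
    """
    query = char_ngrams(mention)
    best_id, best_score = None, 0.0
    for name, identifier in lexicon.items():
        score = cosine(query, char_ngrams(name))
        if score > best_score:
            best_id, best_score = identifier, score
    return best_id if best_score >= threshold else None


# Hypothetical toy lexicon: entity names mapped to made-up identifiers.
LEXICON = {
    "tumor necrosis factor": "ID:0001",
    "interleukin 6": "ID:0002",
    "escherichia coli": "ID:0003",
}
```

With this toy lexicon, `normalize("tumour necrosis factor", LEXICON)` returns `"ID:0001"` despite the spelling variant, because most trigrams are shared between the two strings. A mention that shares no trigrams with any lexicon name returns `None`.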
