NERChem: adapting NERBio to chemical patents via full-token features and named entity feature with chemical sub-class composition.

Authors

Tsai Richard Tzong-Han, Hsiao Yu-Cheng, Lai Po-Ting

Publication information

Database (Oxford). 2016 Oct 25;2016:baw135. doi: 10.1093/database/baw135.

DOI:10.1093/database/baw135

PMID:31414701

Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC5091336/

Abstract

Chemical patents contain detailed information on novel chemical compounds that is valuable to the chemical and pharmaceutical industries. In this paper, we introduce NERChem, a system that recognizes chemical named entity mentions in chemical patents. NERChem is based on the conditional random fields (CRF) model. Our approach incorporates (1) class composition, which combines chemical classes whose naming conventions are similar; (2) BioNE features, which distinguish chemical mentions from other biomedical NE mentions in the patents; and (3) full-token word features, which resolve the tokenization granularity problem. We evaluated our approach on the BioCreative V CHEMDNER-patent corpus and achieved an F-score of 87.17% in the Chemical Entity Mention in Patents (CEMP) task and a sensitivity of 98.58% in the Chemical Passage Detection (CPD) task, ranking alongside the top systems. Database URL: Our NERChem web-based system is publicly available at iisrserv.csie.ncu.edu.tw/nerchem.
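The abstract names full-token word features and class composition without giving details. As a rough illustration only, and not the authors' implementation, the Python sketch below shows one way these two ideas could be realized before training a CRF. The tokenization rule, the feature names, and the `CLASS_COMPOSITION` grouping are all assumptions made for this example.

```python
import re

# Hypothetical grouping of chemical sub-classes into composite classes.
# The grouping actually used by NERChem is not stated in the abstract.
CLASS_COMPOSITION = {
    "SYSTEMATIC": "SYS_TRIV",
    "TRIVIAL": "SYS_TRIV",
    "FORMULA": "FORMULA",
    "ABBREVIATION": "ABBR_ID",
    "IDENTIFIER": "ABBR_ID",
}


def fine_tokenize(text):
    """Split text into fine-grained tokens (ASCII letter runs, digit runs,
    single symbols) and pair each one with its whitespace-delimited parent."""
    pairs = []
    for full in text.split():
        for piece in re.findall(r"[A-Za-z]+|\d+|[^A-Za-z\d\s]", full):
            pairs.append((piece, full))
    return pairs


def token_features(text):
    """Build one feature dict per fine-grained token. The 'full_token'
    feature carries the coarse token, so a model sees both granularities."""
    return [
        {
            "word": piece.lower(),
            "full_token": full.lower(),
            "is_digit": piece.isdigit(),
        }
        for piece, full in fine_tokenize(text)
    ]


def compose_label(bio_label):
    """Map a BIO label such as 'B-TRIVIAL' onto its composite class."""
    if bio_label == "O":
        return "O"
    prefix, cls = bio_label.split("-", 1)
    return f"{prefix}-{CLASS_COMPOSITION.get(cls, cls)}"


feats = token_features("2-methylpropane was added")
labels = [compose_label(l) for l in ["B-SYSTEMATIC", "I-SYSTEMATIC", "I-SYSTEMATIC", "O", "O"]]
```

Here the fine-grained pieces `2`, `-`, and `methylpropane` all share the full-token feature `2-methylpropane`, which is one plausible reading of how a full-token feature could offset over-segmentation. Feature dicts of this shape can be passed to a CRF toolkit of the reader's choice.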
