
An attention-based BiLSTM-CRF approach to document-level chemical named entity recognition.

Affiliations

College of Computer Science and Technology, Dalian University of Technology, Dalian 116024, China.

Beijing Institute of Health Administration and Medical Information, Beijing 100850, China.

Publication information

Bioinformatics. 2018 Apr 15;34(8):1381-1388. doi: 10.1093/bioinformatics/btx761.

DOI: 10.1093/bioinformatics/btx761
PMID: 29186323

Abstract

MOTIVATION

In biomedical research, chemicals are an important class of entities, and chemical named entity recognition (NER) is an important task in biomedical information extraction. However, most popular chemical NER methods are based on traditional machine learning, and their performance depends heavily on feature engineering. Moreover, these methods operate at the sentence level and therefore suffer from tagging inconsistency: the same token can receive different labels in different sentences of one document.
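
To make the tagging inconsistency problem concrete, the following minimal sketch (not taken from the paper) lists tokens that receive more than one label within a single document. The function name `find_inconsistent_tokens`, the simplified `CHEM`/`O` label set, and the toy sentences are hypothetical illustrations. A real tagger would typically emit BIO-style labels, where some label variation for one token is legitimate.

```python
from collections import defaultdict


def find_inconsistent_tokens(tagged_sentences):
    """Return {token: sorted labels} for tokens tagged with more than one label.

    tagged_sentences: list of sentences, each a list of (token, label) pairs,
    as a sentence-level tagger might emit them for one document.
    """
    labels_seen = defaultdict(set)
    for sentence in tagged_sentences:
        for token, label in sentence:
            # Compare case-insensitively so "Aspirin" and "aspirin" match.
            labels_seen[token.lower()].add(label)
    return {tok: sorted(labels) for tok, labels in labels_seen.items() if len(labels) > 1}


# Toy document (hypothetical): "aspirin" is tagged CHEM once and O once.
document = [
    [("Aspirin", "CHEM"), ("reduces", "O"), ("fever", "O")],
    [("Patients", "O"), ("took", "O"), ("aspirin", "O"), ("daily", "O")],
]
inconsistent = find_inconsistent_tokens(document)
print(inconsistent)
```

Here the second sentence gives the tagger too little local context to recognize the chemical, which is the kind of case that document-level information is meant to help with.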

RESULTS

In this paper, we propose a neural network approach to document-level chemical NER: an attention-based bidirectional Long Short-Term Memory network with a conditional random field layer (Att-BiLSTM-CRF). The approach uses document-level global information, obtained through an attention mechanism, to enforce tagging consistency across multiple instances of the same token in a document. With little feature engineering, it outperforms other state-of-the-art methods on the BioCreative IV chemical compound and drug name recognition (CHEMDNER) corpus and the BioCreative V chemical-disease relation (CDR) task corpus, with F-scores of 91.14% and 92.57%, respectively.
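
The idea of enriching each token with document-level information through attention can be sketched as follows. This is an illustration under stated assumptions, not the authors' implementation: the abstract does not specify the scoring function, so plain dot-product scoring is assumed here. The BiLSTM encoder and the CRF layer are omitted, and the toy vectors stand in for encoder outputs. The names `softmax`, `dot`, and `document_attention` are hypothetical.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def dot(u, v):
    """Dot product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def document_attention(hidden_states):
    """Concatenate each token vector with an attention-weighted document context.

    hidden_states: one vector per token, covering the whole document
    (assumed here to come from an encoder such as a BiLSTM).
    """
    augmented = []
    for query in hidden_states:
        # Score the current token against every token in the document.
        weights = softmax([dot(query, key) for key in hidden_states])
        # The weighted average of all token vectors is the global context.
        context = [
            sum(w * h[d] for w, h in zip(weights, hidden_states))
            for d in range(len(query))
        ]
        augmented.append(list(query) + context)
    return augmented


# Toy document of three tokens; tokens 0 and 2 share the same representation,
# as two mentions of the same chemical might.
doc_states = [[1.0, 0.0], [0.0, 1.0], [1.0, 0.0]]
augmented = document_attention(doc_states)
for vec in augmented:
    print([round(x, 3) for x in vec])
```

Tokens with similar representations attend strongly to one another, so repeated mentions end up with similar augmented vectors. A downstream tagging layer could use that shared context to label them consistently.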

AVAILABILITY AND IMPLEMENTATION

Data and code are available at https://github.com/lingluodlut/Att-ChemdNER.

CONTACT

yangzh@dlut.edu.cn or wangleibihami@gmail.com.

SUPPLEMENTARY INFORMATION

Supplementary data are available at Bioinformatics online.
