Enhancing biomedical named entity recognition with parallel boundary detection and category classification.

Authors

Wang Yu, Tong Hanghang, Zhu Ziye, Hou Fengzhen, Li Yun

Affiliations

School of Science, China Pharmaceutical University, Nanjing, China.

Department of Computer Science, University of Illinois at Urbana-Champaign, Urbana, IL, USA.

Publication

BMC Bioinformatics. 2025 Feb 25;26(1):63. doi: 10.1186/s12859-025-06086-4.

DOI:10.1186/s12859-025-06086-4
PMID:40000968
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC11863403/
Abstract

BACKGROUND

Named entity recognition (NER) is a fundamental task in natural language processing. Recognizing entities in biomedical text, known as BioNER, is particularly crucial for cutting-edge applications. However, BioNER poses greater challenges than traditional NER because of (1) nested structures and (2) category correlations inherent in biomedical entities. Recently, various BioNER models have been developed based on region classification or large language models. Although successful, these models still struggle to balance handling nested structures against capturing category knowledge.
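
To make "nested structures" concrete, the following is a minimal sketch, assuming entities are stored as (start, end, category) spans over token indices. The sentence, the spans, and the helper names `contains` and `nested_pairs` are hypothetical illustrations and are not taken from the paper or its datasets. A flat one-label-per-token tagging scheme could not represent both spans at once, because the inner entity shares a token with the outer one.

```python
from typing import List, Tuple

# (start token index, end token index inclusive, category); an assumed representation
Span = Tuple[int, int, str]

def contains(outer: Span, inner: Span) -> bool:
    """True when `inner` lies inside `outer` and the two boundaries differ."""
    same_boundaries = outer[:2] == inner[:2]
    return outer[0] <= inner[0] and inner[1] <= outer[1] and not same_boundaries

def nested_pairs(spans: List[Span]) -> List[Tuple[Span, Span]]:
    """Return every (outer, inner) pair of nested spans."""
    return [(a, b) for a in spans for b in spans if contains(a, b)]

# Hypothetical example sentence and annotations, for illustration only.
tokens = ["human", "interleukin-2", "gene", "expression"]
spans = [(0, 2, "DNA"), (1, 1, "protein")]

pairs = nested_pairs(spans)
for outer, inner in pairs:
    outer_text = " ".join(tokens[outer[0]:outer[1] + 1])
    inner_text = " ".join(tokens[inner[0]:inner[1] + 1])
    print(f"{inner[2]} '{inner_text}' is nested inside {outer[2]} '{outer_text}'")
```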

RESULTS

We present a novel parallel BioNER model, BEAN, designed to address the unique properties of biomedical entities while achieving a reasonable balance between handling nested structures and incorporating category correlations. Extensive experiments on five public NER datasets, including four biomedical datasets, demonstrate that BEAN achieves state-of-the-art performance.

CONCLUSIONS

The proposed BEAN is elaborately designed to achieve two key objectives of the BioNER task: clearly detecting entity boundaries and correctly classifying entity categories. It is the first BioNER model to handle nested structures and category correlations in parallel. We exploit head, tail, and contextualized features to efficiently detect entity boundaries via a triaffine model. To the best of our knowledge, we are the first to introduce a multi-label classification model for the BioNER task to extract entity category information without boundary guidance.
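
The two parallel components named above can be sketched as follows. This is a rough NumPy illustration under stated assumptions, not the authors' implementation: the mean-pooled span context, the sentence-level sigmoid head, the random weights, and all function and variable names are assumptions introduced here. The abstract only states that head, tail, and contextualized features feed a triaffine model and that categories are predicted by a multi-label classifier without boundary guidance.

```python
import numpy as np

def span_context(tokens: np.ndarray) -> np.ndarray:
    """Mean-pooled token vectors for each span i..j with i <= j (an assumed choice).
    Entries with j < i stay zero."""
    n, d = tokens.shape
    prefix = np.vstack([np.zeros((1, d)), np.cumsum(tokens, axis=0)])
    ctx = np.zeros((n, n, d))
    for i in range(n):
        for j in range(i, n):
            ctx[i, j] = (prefix[j + 1] - prefix[i]) / (j - i + 1)
    return ctx

def triaffine_boundary_scores(head, tail, context, W):
    """score[i, j] = sum_{a,b,c} W[a,b,c] * head[i,a] * tail[j,b] * context[i,j,c]."""
    return np.einsum("abc,ia,jb,ijc->ij", W, head, tail, context)

def multilabel_categories(sentence_vec, V, b, threshold=0.5):
    """One independent sigmoid per category, so several categories can be active
    at once. No span boundaries are used here (assumed sentence-level head)."""
    probs = 1.0 / (1.0 + np.exp(-(V @ sentence_vec + b)))
    return probs >= threshold, probs

# Random stand-ins for encoder outputs and learned parameters (assumptions).
rng = np.random.default_rng(0)
n_tokens, d, n_categories = 5, 4, 3
tokens = rng.normal(size=(n_tokens, d))
head = tokens @ rng.normal(size=(d, d))   # head-specific projection
tail = tokens @ rng.normal(size=(d, d))   # tail-specific projection
W = rng.normal(size=(d, d, d))            # triaffine weight tensor
context = span_context(tokens)

scores = triaffine_boundary_scores(head, tail, context, W)
active, probs = multilabel_categories(
    tokens.mean(axis=0), rng.normal(size=(n_categories, d)), np.zeros(n_categories)
)
print(scores.shape, active.shape)
```

Because the boundary scores and the category probabilities are computed from the same token representations without depending on each other, the two steps can run in parallel, which is the property the abstract emphasizes. How the two outputs are combined into final entity predictions is not described in the abstract and is left out of this sketch.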

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/216f/11863403/0b451f4a7027/12859_2025_6086_Fig1_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/216f/11863403/837d573bc570/12859_2025_6086_Fig2_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/216f/11863403/789b611def30/12859_2025_6086_Fig3_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/216f/11863403/9bfb5e5e7592/12859_2025_6086_Fig4_HTML.jpg