Noise Reduction Learning Based on XLNet-CRF for Biomedical Named Entity Recognition.

Authors

Chai Zhaoying, Jin Han, Shi Shenghui, Zhan Siyan, Zhuo Lin, Yang Yu, Lian Qi

Publication information

IEEE/ACM Trans Comput Biol Bioinform. 2023 Jan-Feb;20(1):595-605. doi: 10.1109/TCBB.2022.3157630. Epub 2023 Feb 3.

DOI:10.1109/TCBB.2022.3157630
PMID:35259113
Abstract

In recent years, Biomedical Named Entity Recognition (BioNER) systems have mainly been based on deep neural networks, which are used to extract information from the rapidly expanding biomedical literature. Long-distance context autoencoding language models based on transformers have recently been employed for BioNER with great success. However, noise interference exists in the process of pre-training and fine-tuning, and there is no effective decoder for label dependency. Current models have many aspects in need of improvement for better performance. We propose two kinds of noise reduction models, Shared Labels and Dynamic Splicing, based on XLNet encoding which is a permutation language pre-training model and decoding by Conditional Random Field (CRF). By testing 15 biomedical named entity recognition datasets, the two models improved the average F1-score by 1.504 and 1.48, respectively, and state-of-the-art performance was achieved on 7 of them. Further analysis proves the effectiveness of the two models and the improvement of the recognition effect of CRF, and suggests the applicable scope of the models according to different data characteristics.
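To illustrate why a CRF decoder helps with label dependency, the following is a minimal sketch of Viterbi decoding for a linear-chain CRF. It is not the authors' implementation: the function name `viterbi_decode`, the dictionary-based score layout, and the toy scores are all assumptions made for illustration. In the setting the abstract describes, the per-token emission scores would come from the XLNet encoder, and the transition scores would be learned during training.

```python
from typing import Dict, List, Sequence, Tuple


def viterbi_decode(
    emissions: Sequence[Dict[str, float]],
    transitions: Dict[Tuple[str, str], float],
    labels: Sequence[str],
) -> List[str]:
    """Return the highest-scoring label sequence for a linear-chain CRF.

    emissions[t][y] is the encoder score for label y at token t.
    transitions[(a, b)] is the score for moving from label a to label b;
    pairs that are missing from the dictionary default to 0.0.
    """
    if not emissions:
        return []

    # best[y] holds the score of the best path that ends in label y
    # at the current token.
    best = {y: emissions[0][y] for y in labels}
    backpointers: List[Dict[str, str]] = []

    for scores in emissions[1:]:
        new_best: Dict[str, float] = {}
        pointers: Dict[str, str] = {}
        for y in labels:
            # Pick the previous label that maximizes path score + transition.
            prev = max(labels, key=lambda p: best[p] + transitions.get((p, y), 0.0))
            new_best[y] = best[prev] + transitions.get((prev, y), 0.0) + scores[y]
            pointers[y] = prev
        best = new_best
        backpointers.append(pointers)

    # Trace back from the best final label to recover the full path.
    path = [max(labels, key=lambda y: best[y])]
    for pointers in reversed(backpointers):
        path.append(pointers[path[-1]])
    path.reverse()
    return path


# Toy example (hypothetical scores): taken token by token, the emissions
# favor "O" then "I", which is not a valid BIO sequence.
toy_labels = ["O", "B", "I"]
toy_emissions = [
    {"O": 1.0, "B": 0.5, "I": 0.0},
    {"O": 0.2, "B": 0.1, "I": 1.0},
]
# A strong penalty on O -> I encodes the label dependency.
toy_transitions = {("O", "I"): -10.0}

decoded = viterbi_decode(toy_emissions, toy_transitions, toy_labels)
print(decoded)  # ['B', 'I']
```

Choosing the best label independently at each token would give `O, I` here. The transition penalty makes the decoder prefer the consistent sequence `B, I`, which is the kind of label-dependency modeling a CRF layer adds on top of an encoder.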

Similar articles

1
Noise Reduction Learning Based on XLNet-CRF for Biomedical Named Entity Recognition.
IEEE/ACM Trans Comput Biol Bioinform. 2023 Jan-Feb;20(1):595-605. doi: 10.1109/TCBB.2022.3157630. Epub 2023 Feb 3.
2
DTranNER: biomedical named entity recognition with deep learning-based label-label transition model.
BMC Bioinformatics. 2020 Feb 11;21(1):53. doi: 10.1186/s12859-020-3393-1.
3
Language model based on deep learning network for biomedical named entity recognition.
Methods. 2024 Jun;226:71-77. doi: 10.1016/j.ymeth.2024.04.013. Epub 2024 Apr 17.
4
Hierarchical shared transfer learning for biomedical named entity recognition.
BMC Bioinformatics. 2022 Jan 4;23(1):8. doi: 10.1186/s12859-021-04551-4.
5
Biomedical named entity recognition with the combined feature attention and fully-shared multi-task learning.
BMC Bioinformatics. 2022 Nov 3;23(1):458. doi: 10.1186/s12859-022-04994-3.
6
Augmenting biomedical named entity recognition with general-domain resources.
J Biomed Inform. 2024 Nov;159:104731. doi: 10.1016/j.jbi.2024.104731. Epub 2024 Oct 4.
7
Exploring the effects of drug, disease, and protein dependencies on biomedical named entity recognition: A comparative analysis.
Front Pharmacol. 2022 Dec 21;13:1020759. doi: 10.3389/fphar.2022.1020759. eCollection 2022.
8
Cross-type biomedical named entity recognition with deep multi-task learning.
Bioinformatics. 2019 May 15;35(10):1745-1752. doi: 10.1093/bioinformatics/bty869.
9
Adversarial active learning for the identification of medical concepts and annotation inconsistency.
J Biomed Inform. 2020 Aug;108:103481. doi: 10.1016/j.jbi.2020.103481. Epub 2020 Jul 18.
10
Deep learning methods for biomedical named entity recognition: a survey and qualitative comparison.
Brief Bioinform. 2021 Nov 5;22(6). doi: 10.1093/bib/bbab282.

Cited by

1
A Review on Electronic Health Record Text-Mining for Biomedical Name Entity Recognition in Healthcare Domain.
Healthcare (Basel). 2023 Apr 28;11(9):1268. doi: 10.3390/healthcare11091268.
2
Biomedical named entity recognition based on fusion multi-features embedding.
Technol Health Care. 2023;31(S1):111-121. doi: 10.3233/THC-236011.