Biomedical named entity recognition using improved green anaconda-assisted Bi-GRU-based hierarchical ResNet model.

Authors

Bhushan Ram Chandra, Donthi Rakesh Kumar, Chilukuri Yojitha, Srinivasarao Ulligaddala, Swetha Polisetty

Affiliations

Software Architect, Alstom Transport India Limited, Bengaluru, India.

Department of CSE GITAM (Deemed to be) UNIVERSITY Hyderabad, Rudraram, India.

Publication information

BMC Bioinformatics. 2025 Jan 30;26(1):34. doi: 10.1186/s12859-024-06008-w.

DOI:10.1186/s12859-024-06008-w
PMID:39885428
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC11780922/
Abstract

BACKGROUND

Biomedical text mining extracts essential information from scientific articles using named entity recognition (NER). Traditional NER methods rely on dictionaries, rules, or curated corpora, which are not always available. Deep learning (DL) methods have emerged to overcome these limitations. However, DL-based NER methods can struggle to capture long-distance relationships within text, and they require large annotated datasets.

RESULTS

This research proposes a novel model to address these challenges: the Improved Green Anaconda-assisted Bi-GRU-based Hierarchical ResNet BNER model (IGa-BiHR BNERM). The MACCROBAT dataset was obtained from Kaggle and underwent several pre-processing steps, namely stop-word filtering, WordNet processing, removal of non-alphanumeric characters, stemming, segmentation, and tokenization, which standardize the text and improve its quality. The pre-processed text was fed into a feature extraction model, the Robustly Optimized BERT Whole Word Masking model, which provides word embeddings that carry semantic information. The BNER step then used the IGa-BiHR BNERM, which showed promising results in accurately identifying named entities.
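The pre-processing steps listed above can be illustrated with a minimal sketch. It is not the authors' implementation: the stop-word list, the suffix-stripping rule, and all function names are hypothetical stand-ins chosen so that the example runs with the Python standard library alone. WordNet processing and sentence segmentation are omitted.

```python
import re

# Hypothetical, deliberately tiny stop-word list (an assumption for illustration).
STOP_WORDS = {"a", "an", "the", "of", "and", "with", "was", "is", "in", "to", "for"}

# Crude suffix stripper standing in for a real stemmer; also an assumption.
SUFFIXES = ("ing", "ed", "es", "s")


def strip_non_alphanumeric(text):
    """Replace every character that is not a letter, digit, or whitespace with a space."""
    return re.sub(r"[^A-Za-z0-9\s]", " ", text)


def tokenize(text):
    """Lowercase the text and split it on whitespace."""
    return text.lower().split()


def remove_stop_words(tokens):
    """Drop tokens that appear in the stop-word list."""
    return [t for t in tokens if t not in STOP_WORDS]


def naive_stem(token):
    """Strip the first matching suffix, keeping at least three characters of stem."""
    for suffix in SUFFIXES:
        if token.endswith(suffix) and len(token) - len(suffix) >= 3:
            return token[: -len(suffix)]
    return token


def preprocess(text):
    """Clean, tokenize, filter stop words, then stem."""
    cleaned = strip_non_alphanumeric(text)
    tokens = remove_stop_words(tokenize(cleaned))
    return [naive_stem(t) for t in tokens]


if __name__ == "__main__":
    print(preprocess("The patient was treated with aspirin (100 mg)."))
```

In this sketch punctuation is removed before tokenization so that tokens such as `(100` do not survive. The abstract does not state the exact order of the steps, so the ordering here is one plausible choice.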

CONCLUSION

To improve the training phase of the IGa-BiHR BNERM, the Improved Green Anaconda Optimization technique was used to select optimal weight coefficients for the model. When tested on the MACCROBAT dataset, the model outperformed previous models with an accuracy of 99.11%. The model identifies biomedical named entities in text effectively and accurately, advancing the field.
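The abstract reports accuracy but does not say how it was computed. As a hedged illustration, the sketch below assumes token-level accuracy, that is, the fraction of tokens whose predicted tag equals the gold tag. The function name and the BIO-style tag labels are hypothetical and are not taken from the paper or the dataset.

```python
def token_accuracy(gold_tags, predicted_tags):
    """Return the fraction of positions where the predicted tag matches the gold tag."""
    if len(gold_tags) != len(predicted_tags):
        raise ValueError("gold and predicted sequences must have the same length")
    if not gold_tags:
        raise ValueError("cannot compute accuracy on an empty sequence")
    correct = sum(g == p for g, p in zip(gold_tags, predicted_tags))
    return correct / len(gold_tags)


if __name__ == "__main__":
    # Hypothetical tags for the five tokens "patient took aspirin 100 mg".
    gold = ["O", "O", "B-Drug", "B-Dose", "I-Dose"]
    pred = ["O", "O", "B-Drug", "B-Dose", "O"]
    print(f"token accuracy: {token_accuracy(gold, pred):.2%}")
```

Under this assumption, four of the five tags match, so the example scores 80%.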


https://cdn.ncbi.nlm.nih.gov/pmc/blobs/89e6/11780922/f3668581e0c2/12859_2024_6008_Fig1_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/89e6/11780922/509af418c19a/12859_2024_6008_Fig2_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/89e6/11780922/7d85dad858e9/12859_2024_6008_Fig3_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/89e6/11780922/ec453bdbe329/12859_2024_6008_Fig4_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/89e6/11780922/0f0ff25e5e19/12859_2024_6008_Figa_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/89e6/11780922/1d7e43e3489b/12859_2024_6008_Fig5_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/89e6/11780922/694ae043f83b/12859_2024_6008_Fig6_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/89e6/11780922/2d278afc2384/12859_2024_6008_Fig7_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/89e6/11780922/d0b7704091d6/12859_2024_6008_Fig8_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/89e6/11780922/700dcd7de338/12859_2024_6008_Fig9_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/89e6/11780922/71e6f4078053/12859_2024_6008_Fig10_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/89e6/11780922/1dedecefc3cf/12859_2024_6008_Fig11_HTML.jpg

Similar articles

1
Biomedical named entity recognition using improved green anaconda-assisted Bi-GRU-based hierarchical ResNet model.
BMC Bioinformatics. 2025 Jan 30;26(1):34. doi: 10.1186/s12859-024-06008-w.
2
Evaluating Medical Entity Recognition in Health Care: Entity Model Quantitative Study.
JMIR Med Inform. 2024 Oct 17;12:e59782. doi: 10.2196/59782.
3
An imConvNet-based deep learning model for Chinese medical named entity recognition.
BMC Med Inform Decis Mak. 2022 Nov 21;22(1):303. doi: 10.1186/s12911-022-02049-4.
4
A method for named entity normalization in biomedical articles: application to diseases and plants.
BMC Bioinformatics. 2017 Oct 13;18(1):451. doi: 10.1186/s12859-017-1857-8.
5
Biomedical named entity recognition using deep neural networks with contextual information.
BMC Bioinformatics. 2019 Dec 27;20(1):735. doi: 10.1186/s12859-019-3321-4.
6
Evaluating word representation features in biomedical named entity recognition tasks.
Biomed Res Int. 2014;2014:240403. doi: 10.1155/2014/240403. Epub 2014 Mar 6.
7
Extracting comprehensive clinical information for breast cancer using deep learning methods.
Int J Med Inform. 2019 Dec;132:103985. doi: 10.1016/j.ijmedinf.2019.103985. Epub 2019 Oct 2.
8
BioBBC: a multi-feature model that enhances the detection of biomedical entities.
Sci Rep. 2024 Apr 2;14(1):7697. doi: 10.1038/s41598-024-58334-x.
9
Analyzing transfer learning impact in biomedical cross-lingual named entity recognition and normalization.
BMC Bioinformatics. 2021 Dec 17;22(Suppl 1):601. doi: 10.1186/s12859-021-04247-9.
10
From zero to hero: Harnessing transformers for biomedical named entity recognition in zero- and few-shot contexts.
Artif Intell Med. 2024 Oct;156:102970. doi: 10.1016/j.artmed.2024.102970. Epub 2024 Aug 24.
