Named entity recognition on bio-medical literature documents using hybrid based approach.

Authors

Ramachandran R, Arutchelvan K

Affiliation

Department of Computer and Information Science, Annamalai University, Tamil Nadu, Chidambaram, India.

Publication

J Ambient Intell Humaniz Comput. 2021 Mar 11:1-10. doi: 10.1007/s12652-021-03078-z.

DOI:10.1007/s12652-021-03078-z
PMID:33723489
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC7947151/
Abstract

Technological advances have brought many changes to the medical field. Progress in technology provides many opportunities to extract valuable insights from huge amounts of unstructured data. The literature published by researchers in the medical domain contains an enormous amount of knowledge, and many organizations are involved in retrieving the hidden information from these documents. Extracting drug names, diseases, symptoms, routes of administration, species, and dosage forms from text has become an easy task thanks to innovations in Natural Language Processing. In this article, a new hybrid approach is proposed to identify named entities in medical literature documents. A new dictionary has been built for route of administration, dosage forms, and symptoms to annotate the entities in the medical documents. A blank spaCy machine learning model is trained on the annotated entities. The trained model provides decent accuracy when compared with the existing model. The hybrid model is validated against the dictionary and, optionally, by a human to calculate the confusion matrix. It is able to identify more entities than the prevailing model. The average F1 score of the proposed hybrid approach across five entities is 73.79%.
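
The dictionary annotation and F1 evaluation steps described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the dictionary terms, the label names (`ROUTE`, `DOSAGE_FORM`, `SYMPTOM`), and the helper functions `annotate`, `f1_per_label`, and `macro_f1` are all hypothetical, and the spaCy training step is omitted so that the example runs with the standard library only. It shows how dictionary terms could be turned into character-offset entity spans and how a per-entity F1 score could be averaged.

```python
import re
from collections import Counter

# Hypothetical mini-dictionary; the paper's actual dictionaries are not reproduced here.
DICTIONARY = {
    "ROUTE": ["oral", "intravenous", "topical"],
    "DOSAGE_FORM": ["tablet", "capsule", "injection"],
    "SYMPTOM": ["fever", "headache", "nausea"],
}


def annotate(text, dictionary=DICTIONARY):
    """Return sorted (start, end, label) character spans for dictionary terms."""
    spans = []
    for label, terms in dictionary.items():
        for term in terms:
            # Whole-word, case-insensitive match of each dictionary term.
            pattern = r"\b" + re.escape(term) + r"\b"
            for match in re.finditer(pattern, text, flags=re.IGNORECASE):
                spans.append((match.start(), match.end(), label))
    return sorted(spans)


def f1_per_label(gold, predicted):
    """Compute precision, recall, and F1 per label from exact span matches."""
    tp, fp, fn = Counter(), Counter(), Counter()
    gold_set, pred_set = set(gold), set(predicted)
    for span in pred_set:
        # A predicted span counts as a true positive only if it matches exactly.
        (tp if span in gold_set else fp)[span[2]] += 1
    for span in gold_set - pred_set:
        fn[span[2]] += 1
    scores = {}
    for label in set(tp) | set(fp) | set(fn):
        p = tp[label] / (tp[label] + fp[label]) if tp[label] + fp[label] else 0.0
        r = tp[label] / (tp[label] + fn[label]) if tp[label] + fn[label] else 0.0
        f1 = 2 * p * r / (p + r) if p + r else 0.0
        scores[label] = {"precision": p, "recall": r, "f1": f1}
    return scores


def macro_f1(scores):
    """Average the per-label F1 scores (unweighted)."""
    return sum(s["f1"] for s in scores.values()) / len(scores) if scores else 0.0


text = "The patient received an oral tablet for fever and headache."
predicted = annotate(text)
```

Spans of this `(start, end, label)` shape are a common character-offset format for NER training data, so the output of `annotate` could plausibly serve as input for training a statistical model. Whether the paper averages F1 in the unweighted way shown by `macro_f1` is an assumption.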

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/8a97/7947151/5bd10e88668d/12652_2021_3078_Fig1_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/8a97/7947151/7cf62362ebdf/12652_2021_3078_Fig2_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/8a97/7947151/da4ce015adf4/12652_2021_3078_Fig3_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/8a97/7947151/d9a2ed18536e/12652_2021_3078_Fig4_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/8a97/7947151/b3a4f6f51501/12652_2021_3078_Fig5_HTML.jpg
