Disease named entity recognition from biomedical literature using a novel convolutional neural network.

Affiliations

College of Computer Science and Technology, Dalian University of Technology, Dalian, 116023, China.

Beijing Institute of Health Administration and Medical Information, Beijing, 100850, China.

Publication information

BMC Med Genomics. 2017 Dec 28;10(Suppl 5):73. doi: 10.1186/s12920-017-0316-8.

DOI:10.1186/s12920-017-0316-8
PMID:29297367
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC5751782/
Abstract

BACKGROUND

Automatic disease named entity recognition (DNER) is of utmost importance for the development of more sophisticated BioNLP tools. However, most conventional CRF-based DNER systems rely on carefully designed features whose selection is labor-intensive and time-consuming. Although most deep learning methods can solve NER problems with little feature engineering, they employ an additional CRF layer to capture the correlation between neighboring labels, which makes them considerably more complicated.

METHODS

In this paper, we propose a novel disease NER approach based on a multiple label convolutional neural network (MCNN). Instead of a CRF layer, the approach employs a multiple label strategy (MLS) that we introduce. First, the character-level embedding, word-level embedding, and lexicon feature embedding are concatenated. Then several convolutional layers are stacked over the concatenated embedding. Finally, the MLS is applied to the output layer to capture the correlation between neighboring labels.
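
The three steps above can be sketched roughly as follows. This is a minimal pure-Python illustration under stated assumptions, not the authors' implementation: the function names (`concat_embeddings`, `conv1d_relu`, `mls_decode`), the toy dimensions, and the weights are all hypothetical. In particular, the abstract does not say how the MLS is realized; the sketch assumes one possible reading, in which every position scores its own label and the labels of its two neighbors, and the scores that refer to the same token are summed before taking the best label.

```python
def concat_embeddings(char_emb, word_emb, lex_emb):
    """Concatenate the three per-token embeddings into one vector per token."""
    return [c + w + l for c, w, l in zip(char_emb, word_emb, lex_emb)]


def conv1d_relu(seq, kernels, bias):
    """Zero-padded ("same") 1D convolution over the token axis, then ReLU.

    seq:     list of T vectors of size D
    kernels: list of F filters, each a list of K vectors of size D (K odd)
    bias:    list of F floats
    Returns a list of T vectors of size F, so layers can be stacked.
    """
    dim = len(seq[0])
    width = len(kernels[0])
    pad = width // 2
    zero = [0.0] * dim
    padded = [zero] * pad + seq + [zero] * pad
    out = []
    for t in range(len(seq)):
        window = padded[t:t + width]
        row = []
        for filt, b in zip(kernels, bias):
            s = b + sum(x * w
                        for vec, wvec in zip(window, filt)
                        for x, w in zip(vec, wvec))
            row.append(max(0.0, s))  # ReLU
        out.append(row)
    return out


def mls_decode(scores, labels):
    """ASSUMED reading of MLS: scores[t] holds label scores for the previous,
    current, and next token. Scores that refer to the same token are summed,
    and the highest-scoring label is chosen for each token."""
    n = len(scores)
    decoded = []
    for t in range(n):
        total = list(scores[t]["cur"])
        if t > 0:  # the left neighbor's opinion about token t
            total = [a + b for a, b in zip(total, scores[t - 1]["next"])]
        if t + 1 < n:  # the right neighbor's opinion about token t
            total = [a + b for a, b in zip(total, scores[t + 1]["prev"])]
        best = max(range(len(labels)), key=total.__getitem__)
        decoded.append(labels[best])
    return decoded


if __name__ == "__main__":
    # Toy input: 3 tokens, 2-dim character, 1-dim word, 1-dim lexicon embeddings.
    char_emb = [[0.1, 0.2], [0.3, 0.1], [0.0, 0.4]]
    word_emb = [[0.5], [0.6], [0.1]]
    lex_emb = [[1.0], [1.0], [0.0]]
    x = concat_embeddings(char_emb, word_emb, lex_emb)      # 3 tokens x 4 dims
    h = conv1d_relu(x, [[[0.1] * 4] * 3, [[0.2] * 4] * 3], [0.0, 0.0])
    h = conv1d_relu(h, [[[0.5] * 2] * 3, [[0.3] * 2] * 3], [0.0, 0.0])
    print(len(h), len(h[0]))  # sequence length and number of filters
```

In this reading, a token whose own scores are ambiguous can still be labeled consistently when its neighbors agree on what it should be, which is the role a CRF layer would otherwise play. A real model would produce the `prev`/`cur`/`next` scores with a learned output layer on top of the stacked convolutions.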

RESULTS

Experimental results show that MCNN achieves state-of-the-art performance on both the NCBI and CDR corpora.

CONCLUSIONS

The proposed MCNN-based disease NER method achieves state-of-the-art performance with little feature engineering, and the experimental results show that the MLS is effective at capturing the correlation between neighboring labels.

Figure 1: https://cdn.ncbi.nlm.nih.gov/pmc/blobs/b5fb/5751782/b368d39a84d4/12920_2017_316_Fig1_HTML.jpg
Figure 2: https://cdn.ncbi.nlm.nih.gov/pmc/blobs/b5fb/5751782/ccfcf1eeead3/12920_2017_316_Fig2_HTML.jpg
Figure 3: https://cdn.ncbi.nlm.nih.gov/pmc/blobs/b5fb/5751782/1d79f0471851/12920_2017_316_Fig3_HTML.jpg
