C-Norm: a neural approach to few-shot entity normalization.

Author information

Université Paris-Saclay, INRAE, MaIAGE, Jouy-en-Josas, France.

Université Paris-Saclay, CNRS, LIMSI, Orsay, France.

Publication information

BMC Bioinformatics. 2020 Dec 29;21(Suppl 23):579. doi: 10.1186/s12859-020-03886-8.

DOI:10.1186/s12859-020-03886-8

PMID:33372606

Original article: https://pmc.ncbi.nlm.nih.gov/articles/PMC7771092/

Abstract

BACKGROUND

Entity normalization is an important information extraction task which has gained renewed attention in the last decade, particularly in the biomedical and life science domains. In these domains, and more generally in all specialized domains, this task is still challenging for the latest machine learning-based approaches, which have difficulty handling highly multi-class and few-shot learning problems. To address this issue, we propose C-Norm, a new neural approach which synergistically combines standard and weak supervision, ontological knowledge integration and distributional semantics.

RESULTS

Our approach greatly outperforms all methods evaluated on the Bacteria Biotope datasets of BioNLP Open Shared Tasks 2019, without integrating any manually-designed domain-specific rules.

CONCLUSIONS

Our results show that relatively shallow neural network methods can perform well in domains that present highly multi-class and few-shot learning problems.
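To make the task concrete, the following is a minimal sketch of entity normalization as nearest-concept search in a distributional vector space. It is not the C-Norm architecture, which the abstract describes as a neural model trained with standard and weak supervision plus ontological knowledge. The word vectors, concept identifiers, and function names here are all hypothetical and chosen only for illustration.

```python
import math

# Toy 3-dimensional word vectors (hypothetical values, for illustration only).
WORD_VECTORS = {
    "gut": [0.9, 0.1, 0.0],
    "intestine": [0.8, 0.2, 0.1],
    "soil": [0.0, 0.9, 0.2],
    "ground": [0.1, 0.8, 0.3],
    "milk": [0.1, 0.1, 0.9],
}

# Toy ontology: hypothetical concept identifiers mapped to their labels.
CONCEPTS = {
    "C:0001": "intestine",
    "C:0002": "soil",
    "C:0003": "milk",
}


def embed(text):
    """Average the vectors of known words; return None if no word is known."""
    vecs = [WORD_VECTORS[w] for w in text.lower().split() if w in WORD_VECTORS]
    if not vecs:
        return None
    return [sum(col) / len(vecs) for col in zip(*vecs)]


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def normalize(mention):
    """Return the ID of the concept whose label vector is closest to the mention."""
    m = embed(mention)
    if m is None:
        return None  # no known word in the mention, so no prediction
    return max(CONCEPTS, key=lambda cid: cosine(m, embed(CONCEPTS[cid])))


if __name__ == "__main__":
    for mention in ["gut", "ground", "xyz"]:
        print(mention, "->", normalize(mention))
```

A baseline like this needs no training examples per concept, which is why embedding similarity is a natural starting point when many concepts have few or no annotated mentions. A learned model would replace the fixed averaging and cosine comparison with trained parameters.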

Figure 1: https://cdn.ncbi.nlm.nih.gov/pmc/blobs/4d8b/7771092/3ad893913a35/12859_2020_3886_Fig1_HTML.jpg
