Entity relation extraction in the medical domain: based on data augmentation.

Authors

Wang Anli, Li Linyi, Wu Xuehong, Zhu Jianping, Yu Shanshan, Chen Xi, Li Jianhua, Zhu Hongtao

Affiliations

Information Center, The Third Xiangya Hospital, Central South University, Changsha, China.

School of Computer Science, Central South University, Changsha, China.

Publication information

Ann Transl Med. 2022 Oct;10(19):1061. doi: 10.21037/atm-22-3991.

DOI:10.21037/atm-22-3991
PMID:36330405
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC9622485/
Abstract

BACKGROUND

Entity relation extraction is an important task in constructing professional knowledge graphs in the medical field. Research on entity relation extraction from academic books in the medical field has shown that the number of instances differs greatly across entity relation types. The result is a typical unbalanced data set, which is difficult for models to learn from but has research value.

METHODS

In this article, we propose a new entity relation extraction method based on data augmentation. The probability that a text is augmented during training is calculated from the distribution of the individual entity relation classes in the data set. In text-oriented data augmentation, different augmentation methods perform differently in different language environments, so reinforcement learning is used to determine which data augmentation method to apply in the current language environment. This strategy was applied to entity relation extraction from a medical professional book, and different data augmentation methods (i.e., no data augmentation, traditional data augmentation, and reinforcement learning-based data augmentation) were compared under the same neural network model.
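The two ideas in this paragraph can be sketched in a few lines of Python. The abstract does not give the probability formula or the reinforcement learning formulation, so everything below is an assumption for illustration: the linear inverse-frequency rule, the `max_prob` cap, the epsilon-greedy bandit used as a stand-in for the reinforcement learning step, and the relation and method names are all hypothetical.

```python
import random
from collections import Counter


def augmentation_probabilities(labels, max_prob=0.9):
    """Map each relation class to the probability of augmenting one of its texts.

    Assumption: the probability rises linearly as a class gets rarer relative
    to the largest class. The source abstract does not state the real formula.
    """
    counts = Counter(labels)
    largest = max(counts.values())
    return {rel: max_prob * (1 - n / largest) for rel, n in counts.items()}


class EpsilonGreedySelector:
    """Hypothetical bandit-style stand-in for the reinforcement learning step."""

    def __init__(self, methods, epsilon=0.1, seed=0):
        self.methods = list(methods)
        self.epsilon = epsilon
        self.rng = random.Random(seed)
        self.value = {m: 0.0 for m in self.methods}  # running mean reward
        self.pulls = {m: 0 for m in self.methods}    # times each method was rewarded

    def choose(self):
        # Explore with probability epsilon, otherwise pick the best-valued method.
        if self.rng.random() < self.epsilon:
            return self.rng.choice(self.methods)
        return max(self.methods, key=lambda m: self.value[m])

    def update(self, method, reward):
        # Incremental mean update of the observed reward for this method.
        self.pulls[method] += 1
        self.value[method] += (reward - self.value[method]) / self.pulls[method]


# Hypothetical unbalanced label set: one large class and two small ones.
labels = ["treats"] * 80 + ["causes"] * 15 + ["contraindicates"] * 5
probs = augmentation_probabilities(labels)

selector = EpsilonGreedySelector(["synonym_swap", "random_deletion"], epsilon=0.0)
selector.update("random_deletion", 1.0)  # e.g. a validation gain used as reward
chosen = selector.choose()
```

In this sketch the largest class is never augmented and the rarest class is augmented most often, while the selector gradually favors whichever augmentation method has earned the highest average reward. A full implementation would condition the choice on the text itself (the "language environment"), which this context-free bandit does not do.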

RESULTS

The deep-learning model using data augmentation outperformed the model without it: data augmentation significantly improved the evaluation indicators of relation classes with low data volumes in the unbalanced data set and slightly improved those of relation classes with sufficient features and large data volumes. Additionally, the deep-learning model using reinforcement learning-based data augmentation was superior to the one using traditional data augmentation. After reinforcement learning-based data augmentation was applied, the evaluation indicators of multiple relation classes were much better than when it was not applied.
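Because the results are reported per relation class, a per-class evaluation is the natural way to see the effect on small classes. The abstract does not name its evaluation indicators; the sketch below assumes precision, recall, and F1, and the relation names and predictions are made up for illustration.

```python
from collections import Counter


def per_class_scores(gold, predicted):
    """Compute precision, recall, and F1 separately for each relation class."""
    tp, fp, fn = Counter(), Counter(), Counter()
    for g, p in zip(gold, predicted):
        if g == p:
            tp[g] += 1
        else:
            fp[p] += 1  # predicted class gains a false positive
            fn[g] += 1  # gold class gains a false negative
    scores = {}
    for rel in sorted(set(gold) | set(predicted)):
        pred_total = tp[rel] + fp[rel]
        gold_total = tp[rel] + fn[rel]
        precision = tp[rel] / pred_total if pred_total else 0.0
        recall = tp[rel] / gold_total if gold_total else 0.0
        denom = precision + recall
        f1 = 2 * precision * recall / denom if denom else 0.0
        scores[rel] = {"precision": precision, "recall": recall, "f1": f1}
    return scores


# Hypothetical test set: 8 examples of a large class, 2 of a small class.
gold = ["treats"] * 8 + ["causes"] * 2
predicted = ["treats"] * 8 + ["treats", "causes"]

scores = per_class_scores(gold, predicted)
accuracy = sum(g == p for g, p in zip(gold, predicted)) / len(gold)
```

Here overall accuracy is 0.9, yet recall on the small class is only 0.5. That gap is why improvements on low-volume classes are easier to see in per-class indicators than in a single aggregate score.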

CONCLUSIONS

For unbalanced data sets, data augmentation can effectively improve the ability of the deep-learning model to obtain data features, and reinforcement learning-based data augmentation can further enhance this ability. Our experiments confirmed the superiority of reinforcement learning-based data augmentation.

Figures

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/2165/9622485/80e242e94aaa/atm-10-19-1061-f1.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/2165/9622485/e0d3462d495b/atm-10-19-1061-f2.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/2165/9622485/76ca8398e5fb/atm-10-19-1061-f3.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/2165/9622485/391bf8d71453/atm-10-19-1061-f4.jpg
