Multiple features for clinical relation extraction: A machine learning approach.

Author information

Kazan Federal University, 18 Kremlyovskaya Street, Kazan 420008, Russian Federation.

Kazan Federal University, 18 Kremlyovskaya Street, Kazan 420008, Russian Federation; St. Petersburg Department of the Steklov Mathematical Institute, 27 Fontanka, St. Petersburg 191023, Russian Federation; Insilico Medicine Hong Kong Ltd, Pak Shek Kok, New Territories, Hong Kong.

Publication information

J Biomed Inform. 2020 Mar;103:103382. doi: 10.1016/j.jbi.2020.103382. Epub 2020 Feb 3.

DOI:10.1016/j.jbi.2020.103382

PMID:32028051

Abstract

Relation extraction aims to discover relational facts about entity mentions in plain text. In this work, we focus on clinical relation extraction: given a medical record with mentions of drugs and their attributes, we identify relations between these entities. We propose a machine learning model with a novel set of knowledge-based and BioSentVec embedding features. We systematically investigate the impact of these features alongside standard distance- and word-based features, conducting experiments on two benchmark datasets of clinical texts from the MADE 2018 and n2c2 2018 shared tasks. For comparison with the feature-based model, we use state-of-the-art models and three BERT-based models, including BioBERT and Clinical BERT. Our results demonstrate that distance and word features provide significant benefits to the classifier. Knowledge-based features improve classification results only for particular types of relations. Among the explored features, the sentence embedding feature provides the largest improvement on the MADE corpus. The classifier obtains state-of-the-art performance in clinical relation extraction with an F-measure of 92.6%, improving F-measure by 3.5% on the MADE corpus.
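
The abstract does not spell out how the distance- and word-based features are computed, so the following is only a minimal sketch of what such features might look like for one candidate pair of mentions, together with the balanced F-measure used for evaluation. The `Mention` type, the feature names, and the example sentence are all hypothetical and are not taken from the paper.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Mention:
    """Hypothetical entity mention given as a token span [start, end)."""
    label: str
    start: int
    end: int


def pair_features(tokens, a, b):
    """Build illustrative distance- and word-based features for one pair."""
    # Order the two mentions by position so the slice between them is valid.
    first, second = (a, b) if a.start <= b.start else (b, a)
    between = tokens[first.end:second.start]
    features = {
        "type_pair": f"{a.label}|{b.label}",  # entity types of the pair
        "token_distance": len(between),       # distance feature
        "a_before_b": a.start <= b.start,     # relative order
    }
    # Word features: one indicator per lowercased token between the mentions.
    for word in between:
        features[f"between={word.lower()}"] = 1
    return features


def f_measure(gold, predicted):
    """Balanced F-measure over sets of gold and predicted relations."""
    true_positives = len(gold & predicted)
    if true_positives == 0:
        return 0.0
    precision = true_positives / len(predicted)
    recall = true_positives / len(gold)
    return 2 * precision * recall / (precision + recall)


# Made-up example sentence with three annotated mentions.
tokens = "Patient started metformin 500 mg twice daily".split()
drug = Mention("Drug", 2, 3)
dose = Mention("Dose", 3, 5)
freq = Mention("Frequency", 5, 7)
features = pair_features(tokens, drug, freq)
```

A feature dictionary of this kind could be vectorized and passed to any standard classifier. The knowledge-based and BioSentVec sentence embedding features described in the abstract would be additional inputs and are not modeled here.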

Similar articles

A Hybrid Model for Family History Information Identification and Relation Extraction: Development and Evaluation of an End-to-End Information Extraction System.

JMIR Med Inform. 2021 Apr 22;9(4):e22797. doi: 10.2196/22797.

Drug knowledge discovery via multi-task learning and pre-trained models.

BMC Med Inform Decis Mak. 2021 Nov 16;21(Suppl 9):251. doi: 10.1186/s12911-021-01614-7.

Extracting comprehensive clinical information for breast cancer using deep learning methods.

Int J Med Inform. 2019 Dec;132:103985. doi: 10.1016/j.ijmedinf.2019.103985. Epub 2019 Oct 2.

An adverse drug effect mentions extraction method based on weighted online recurrent extreme learning machine.

Comput Methods Programs Biomed. 2019 Jul;176:33-41. doi: 10.1016/j.cmpb.2019.04.029. Epub 2019 Apr 30.

Relation Classification for Bleeding Events From Electronic Health Records Using Deep Learning Systems: An Empirical Study.

JMIR Med Inform. 2021 Jul 2;9(7):e27527. doi: 10.2196/27527.

LBERT: Lexically aware Transformer-based Bidirectional Encoder Representation model for learning universal bio-entity relations.

Bioinformatics. 2021 Apr 20;37(3):404-412. doi: 10.1093/bioinformatics/btaa721.

Methods Mol Biol. 2022;2496:221-235. doi: 10.1007/978-1-0716-2305-3_12.

An attentive joint model with transformer-based weighted graph convolutional network for extracting adverse drug event relation.

J Biomed Inform. 2022 Jan;125:103968. doi: 10.1016/j.jbi.2021.103968. Epub 2021 Dec 4.

Entity recognition from clinical texts via recurrent neural network.

BMC Med Inform Decis Mak. 2017 Jul 5;17(Suppl 2):67. doi: 10.1186/s12911-017-0468-7.

Cited by

ERNIE-UIE: Advancing information extraction in Chinese medical knowledge graph.

PLoS One. 2025 May 29;20(5):e0325082. doi: 10.1371/journal.pone.0325082. eCollection 2025.

Enhancing Relation Extraction for COVID-19 Vaccine Shot-Adverse Event Associations with Large Language Models.

Res Sq. 2025 Mar 17:rs.3.rs-6201919. doi: 10.21203/rs.3.rs-6201919/v1.

A framework for integrating biomedical knowledge in Wikidata with open biological and biomedical ontologies and MeSH keywords.

Heliyon. 2024 Sep 27;10(19):e38448. doi: 10.1016/j.heliyon.2024.e38448. eCollection 2024 Oct 15.

Large Language Models and Genomics for Summarizing the Role of microRNA in Regulating mRNA Expression.

Biomedicines. 2024 Jul 10;12(7):1535. doi: 10.3390/biomedicines12071535.

Transformers and large language models in healthcare: A review.

Artif Intell Med. 2024 Aug;154:102900. doi: 10.1016/j.artmed.2024.102900. Epub 2024 Jun 5.

Automatic extraction of ranked SNP-phenotype associations from text using a BERT-LSTM-based method.

BMC Bioinformatics. 2023 Apr 12;24(1):144. doi: 10.1186/s12859-023-05236-w.

A hybrid algorithm for clinical decision support in precision medicine based on machine learning.

BMC Bioinformatics. 2023 Jan 3;24(1):3. doi: 10.1186/s12859-022-05116-9.

A large language model for electronic health records.

NPJ Digit Med. 2022 Dec 26;5(1):194. doi: 10.1038/s41746-022-00742-2.

Identification and Impact Analysis of Family History of Psychiatric Disorder in Mood Disorder Patients With Pretrained Language Model.

Front Psychiatry. 2022 May 20;13:861930. doi: 10.3389/fpsyt.2022.861930. eCollection 2022.

Extracting Adverse Drug Events from Clinical Notes.

AMIA Jt Summits Transl Sci Proc. 2021 May 17;2021:420-429. eCollection 2021.
