Exploiting sequence labeling framework to extract document-level relations from biomedical texts.

Author information

School of Computer Science and Technology, Dalian University of Technology, Dalian, 116024, China.

School of Biomedical Informatics, University of Texas Health Science Center at Houston, Houston, 77030, USA.

Publication information

BMC Bioinformatics. 2020 Mar 27;21(1):125. doi: 10.1186/s12859-020-3457-2.

DOI:10.1186/s12859-020-3457-2

PMID:32216746

Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC7099809/

Abstract

BACKGROUND

Both intra- and inter-sentential semantic relations in biomedical texts provide valuable information for biomedical research. However, most existing methods either focus on intra-sentential relations and ignore inter-sentential ones, or fail to extract inter-sentential relations accurately. They also treat the instances containing entity relations as independent, which neglects the interactions between relations. We propose a novel sequence labeling-based biomedical relation extraction method named Bio-Seq. In this method, the sequence labeling framework is extended with multiple specified feature extractors to facilitate feature extraction at different levels, especially at the inter-sentential level. In addition, the sequence labeling framework enables Bio-Seq to take advantage of the interactions between relations, which further improves the precision of document-level relation extraction.
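As a rough illustration of the formulation, the sketch below shows one way relation extraction can be cast as sequence labeling: a source entity is fixed, every token in the document receives a tag, and tagged spans are decoded into relation pairs, so several relations (in any sentence) come out of a single pass. The abstract does not describe Bio-Seq's actual tagging scheme, so the tag names (`B-REL`, `I-REL`, `O`), the helper `decode_relations`, and the toy document are all assumptions made for illustration only.

```python
from typing import List, Tuple


def decode_relations(
    tokens: List[str], tags: List[str], source: str
) -> List[Tuple[str, str]]:
    """Decode a tag sequence for one source entity into (source, target) pairs.

    Hypothetical scheme: B-REL opens a related target span, I-REL continues it,
    and any other tag closes it. This is not the paper's documented scheme.
    """
    if len(tokens) != len(tags):
        raise ValueError("tokens and tags must have the same length")

    pairs: List[Tuple[str, str]] = []
    span: List[str] = []
    for token, tag in zip(tokens, tags):
        if tag == "B-REL":
            # A new span starts; flush the previous one if it is still open.
            if span:
                pairs.append((source, " ".join(span)))
            span = [token]
        elif tag == "I-REL" and span:
            span.append(token)
        else:
            # "O" (or a stray I-REL with no open span) closes any open span.
            if span:
                pairs.append((source, " ".join(span)))
            span = []
    if span:
        pairs.append((source, " ".join(span)))
    return pairs


# Toy two-sentence document (made up); the source entity is the chemical "Lithium".
tokens = (
    "Lithium was given for two weeks . "
    "The patient later developed renal failure and tremor ."
).split()
tags = ["O"] * 11 + ["B-REL", "I-REL", "O", "B-REL", "O"]

pairs = decode_relations(tokens, tags, source="Lithium")
print(pairs)
```

Because both targets sit in a different sentence from the source entity and are produced by the same labeling pass, the toy example hints at why this framing suits inter-sentential relations and lets one predicted relation inform another.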

RESULTS

Our proposed method obtained an F1-score of 63.5% on the BioCreative V chemical disease relation corpus and an F1-score of 54.4% on inter-sentential relations, which was 10.5% better than the document-level classification baseline. Our method also achieved an F1-score of 85.1% on the n2c2-ADE sub-dataset.
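For readers less familiar with the metric, F1 is the harmonic mean of precision and recall. The sketch below computes it over sets of predicted and gold relation pairs. The function name `precision_recall_f1` and the toy identifiers are illustrative assumptions; this is not the paper's evaluation script and does not reproduce its reported figures.

```python
from typing import Set, Tuple

Pair = Tuple[str, str]  # (chemical_id, disease_id), hypothetical identifiers


def precision_recall_f1(
    predicted: Set[Pair], gold: Set[Pair]
) -> Tuple[float, float, float]:
    """Return (precision, recall, F1) for predicted vs. gold relation pairs."""
    true_positives = len(predicted & gold)
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(gold) if gold else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


# Toy data: 4 gold pairs, 3 predictions, 2 of which are correct.
gold = {("C1", "D1"), ("C1", "D2"), ("C2", "D3"), ("C3", "D4")}
predicted = {("C1", "D1"), ("C2", "D3"), ("C2", "D5")}

precision, recall, f1 = precision_recall_f1(predicted, gold)
print(f"P={precision:.3f} R={recall:.3f} F1={f1:.3f}")
```

With these toy sets, precision is 2/3 and recall is 1/2, so F1 works out to 4/7.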

CONCLUSION

The sequence labeling method can be successfully used to extract document-level relations and is especially effective at boosting performance on inter-sentential relation extraction. Our work can facilitate research on document-level biomedical text mining.

Similar articles

DocR-BERT: Document-Level R-BERT for Chemical-Induced Disease Relation Extraction via Gaussian Probability Distribution.

IEEE J Biomed Health Inform. 2022 Mar;26(3):1341-1352. doi: 10.1109/JBHI.2021.3116769. Epub 2022 Mar 7.

An effective neural model extracting document level chemical-induced disease relations from biomedical literature.

J Biomed Inform. 2018 Jul;83:1-9. doi: 10.1016/j.jbi.2018.05.001. Epub 2018 May 8.

Extracting Inter-Sentence Relations for Associating Biological Context with Events in Biomedical Texts.

IEEE/ACM Trans Comput Biol Bioinform. 2020 Nov-Dec;17(6):1895-1906. doi: 10.1109/TCBB.2019.2904231. Epub 2020 Dec 8.

Biomedical relation extraction via knowledge-enhanced reading comprehension.

BMC Bioinformatics. 2022 Jan 6;23(1):20. doi: 10.1186/s12859-021-04534-5.

Enhancing Biomedical Relation Extraction with Transformer Models using Shortest Dependency Path Features and Triplet Information.

J Biomed Inform. 2021 Oct;122:103893. doi: 10.1016/j.jbi.2021.103893. Epub 2021 Sep 2.

Detecting causality from online psychiatric texts using inter-sentential language patterns.

BMC Med Inform Decis Mak. 2012 Jul 18;12:72. doi: 10.1186/1472-6947-12-72.

Enhanced Heterogeneous Graph Attention Network with a Novel Multilabel Focal Loss for Document-Level Relation Extraction.

Entropy (Basel). 2024 Feb 28;26(3):210. doi: 10.3390/e26030210.

Hierarchical sequence labeling for extracting BEL statements from biomedical literature.

BMC Med Inform Decis Mak. 2019 Apr 9;19(Suppl 2):63. doi: 10.1186/s12911-019-0758-3.

DTranNER: biomedical named entity recognition with deep learning-based label-label transition model.

BMC Bioinformatics. 2020 Feb 11;21(1):53. doi: 10.1186/s12859-020-3393-1.

Cited by

Exploiting question-answer framework with multi-GRU to detect adverse drug reaction on social media.

Sci Rep. 2025 Feb 4;15(1):4157. doi: 10.1038/s41598-025-87724-y.

Assigning species information to corresponding genes by a sequence labeling framework.

Database (Oxford). 2022 Oct 13;2022. doi: 10.1093/database/baac090.

A sequence labeling framework for extracting drug-protein relations from biomedical literature.

Database (Oxford). 2022 Jul 19;2022. doi: 10.1093/database/baac058.

Biomedical relation extraction via knowledge-enhanced reading comprehension.

BMC Bioinformatics. 2022 Jan 6;23(1):20. doi: 10.1186/s12859-021-04534-5.
