BELMiner: adapting a rule-based relation extraction system to extract biological expression language statements from bio-medical literature evidence sentences.

Author information

Ravikumar K E, Rastegar-Mojarad Majid, Liu Hongfang

Affiliations

Department of Health Sciences Research, Mayo Clinic, USA.

Department of Health Informatics and Administration, University of Wisconsin-Milwaukee, Milwaukee, WI, USA.

Publication information

Database (Oxford). 2017 Jan 1;2017(1). doi: 10.1093/database/baw156.

DOI:10.1093/database/baw156

PMID:28365720

Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC5467463/

Abstract

Extracting semantically meaningful relationships from biomedical literature is a challenging task. The BioCreative V Track 4 challenge organized, for the first time, a comprehensive shared task to test the robustness of text-mining algorithms in extracting semantically meaningful assertions from evidence statements in biomedical text. In this work, we tested the ability of a rule-based semantic parser to extract Biological Expression Language (BEL) statements from evidence sentences drawn from the biomedical literature, as part of the BioCreative V Track 4 challenge. The system achieved an overall best F-measure of 21.29% in extracting complete BEL statements. For relation extraction, the system achieved an F-measure of 65.13% on the test data set. Our system achieved the best performance in five of the six criteria that the task organizers adopted for evaluation. An inability to derive semantic inferences and limitations in the rule sets that map textual extractions to BEL functions were among the reasons for the low performance in extracting complete BEL statements. After the shared task, we also evaluated the impact of different NER components on the ability to extract BEL statements from the test data set, and made a single change to the rule sets that translate relation extractions into BEL statements. We observed a marked improvement of over 20% in BELMiner's overall performance in extracting BEL statements on the test set. The system is available as a REST API at http://54.146.11.205:8484/BELXtractor/finder/.
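
The F-measures reported above combine precision and recall over extracted statements. The following is a minimal sketch of that arithmetic, assuming exact string matching between predicted and gold BEL statements. The function name `f_measure` and the toy statements are hypothetical illustrations. They are not taken from BELMiner or from the BioCreative corpus, and the official task evaluation may score partial matches differently.

```python
from typing import Iterable, Tuple


def f_measure(predicted: Iterable[str], gold: Iterable[str]) -> Tuple[float, float, float]:
    """Return (precision, recall, F1) using exact-match set comparison."""
    pred_set, gold_set = set(predicted), set(gold)
    true_pos = len(pred_set & gold_set)  # statements found in both sets
    precision = true_pos / len(pred_set) if pred_set else 0.0
    recall = true_pos / len(gold_set) if gold_set else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


# Hypothetical toy data for illustration only.
gold_statements = [
    "p(HGNC:AKT1) increases p(HGNC:MDM2)",
    "p(HGNC:TP53) decreases p(HGNC:BCL2)",
    "p(HGNC:TNF) increases p(HGNC:IL6)",
    "p(HGNC:PTEN) decreases p(HGNC:AKT1)",
]
predicted_statements = [
    "p(HGNC:AKT1) increases p(HGNC:MDM2)",  # matches a gold statement
    "p(HGNC:TNF) decreases p(HGNC:IL6)",    # wrong relation, so no match
]

precision, recall, f1 = f_measure(predicted_statements, gold_statements)
print(f"precision={precision:.2f} recall={recall:.2f} F1={f1:.4f}")
```

Under exact matching, a statement with the right entities but the wrong relation counts as a miss, which is one reason complete-statement scores can sit well below relation-level scores.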

DATABASE URL

http://54.146.11.205:8484/BELXtractor/finder/.

Similar articles

BELTracker: evidence sentence retrieval for BEL statements.

Database (Oxford). 2016 May 12;2016. doi: 10.1093/database/baw079. Print 2016.

BelSmile: a biomedical semantic role labeling approach for extracting biological expression language from text.

Database (Oxford). 2016 May 12;2016. doi: 10.1093/database/baw064. Print 2016.

Hierarchical sequence labeling for extracting BEL statements from biomedical literature.

BMC Med Inform Decis Mak. 2019 Apr 9;19(Suppl 2):63. doi: 10.1186/s12911-019-0758-3.

Combining relation extraction with function detection for BEL statement extraction.

Database (Oxford). 2019 Jan 1;2019:bay133. doi: 10.1093/database/bay133.

Training and evaluation corpora for the extraction of causal relationships encoded in biological expression language (BEL).

Database (Oxford). 2016 Aug 23;2016. doi: 10.1093/database/baw113. Print 2016.

The extraction of complex relationships and their conversion to biological expression language (BEL) overview of the BioCreative VI (2017) BEL track.

Database (Oxford). 2019 Jan 1;2019. doi: 10.1093/database/baz084.

Extraction of causal relations based on SBEL and BERT model.

Database (Oxford). 2021 Feb 18;2021. doi: 10.1093/database/baab005.

Coreference resolution improves extraction of Biological Expression Language statements from texts.

Database (Oxford). 2016 Jul 3;2016. doi: 10.1093/database/baw076. Print 2016.

The BEL information extraction workflow (BELIEF): evaluation in the BioCreative V BEL and IAT track.

Database (Oxford). 2016 Oct 2;2016. doi: 10.1093/database/baw136. Print 2016.

Cited by

KGG: a fully automated workflow for creating disease-specific knowledge graphs.

Bioinformatics. 2025 Jul 1;41(7). doi: 10.1093/bioinformatics/btaf383.

A reproducible experimental survey on biomedical sentence similarity: A string-based method sets the state of the art.

PLoS One. 2022 Nov 21;17(11):e0276539. doi: 10.1371/journal.pone.0276539. eCollection 2022.

Data Integration Challenges for Machine Learning in Precision Medicine.

Front Med (Lausanne). 2022 Jan 25;8:784455. doi: 10.3389/fmed.2021.784455. eCollection 2021.

Deep semi-supervised learning ensemble framework for classifying co-mentions of human proteins and phenotypes.

BMC Bioinformatics. 2021 Oct 16;22(1):500. doi: 10.1186/s12859-021-04421-z.

Mining a stroke knowledge graph from literature.

BMC Bioinformatics. 2021 Jul 29;22(Suppl 10):387. doi: 10.1186/s12859-021-04292-4.

Protocol for a reproducible experimental survey on biomedical sentence similarity.

PLoS One. 2021 Mar 24;16(3):e0248663. doi: 10.1371/journal.pone.0248663. eCollection 2021.

Extraction of causal relations based on SBEL and BERT model.

Database (Oxford). 2021 Feb 18;2021. doi: 10.1093/database/baab005.

Constructing knowledge graphs and their biomedical applications.

Comput Struct Biotechnol J. 2020 Jun 2;18:1414-1428. doi: 10.1016/j.csbj.2020.05.017. eCollection 2020.

Deep learning with sentence embeddings pre-trained on biomedical corpora improves the performance of finding similar sentences in electronic medical records.

BMC Med Inform Decis Mak. 2020 Apr 30;20(Suppl 1):73. doi: 10.1186/s12911-020-1044-0.

Using a Large Margin Context-Aware Convolutional Neural Network to Automatically Extract Disease-Disease Association from Literature: Comparative Analytic Study.

JMIR Med Inform. 2019 Nov 26;7(4):e14502. doi: 10.2196/14502.
