Biodiversity Observations Miner: A web application to unlock primary biodiversity data from published literature.

Authors

Muñoz Gabriel, Kissling W Daniel, van Loon E Emiel

Affiliations

NASUA, Biodiversity research and conservation section, Quito, Ecuador.

Faculty of Arts and Science, Department of Biology, Concordia University, Montreal, Canada.

Publication information

Biodivers Data J. 2019 Jan 16;(7):e28737. doi: 10.3897/BDJ.7.e28737. eCollection 2019.

DOI:10.3897/BDJ.7.e28737

PMID:30692868

Original article: https://pmc.ncbi.nlm.nih.gov/articles/PMC6344444/

Abstract

BACKGROUND

A considerable portion of primary biodiversity data is digitally locked inside published literature, which is often stored as PDF files. Large-scale approaches to biodiversity science could benefit from retrieving this information and making it digitally accessible and machine-readable. Nonetheless, the amount and diversity of digitally published literature pose many challenges for knowledge discovery and retrieval. Text mining has been extensively used for data discovery tasks in large quantities of documents. However, text mining approaches for knowledge discovery and retrieval have been limited in biodiversity science compared to other disciplines.

NEW INFORMATION

Here, we present a novel, open source text mining tool, the Biodiversity Observations Miner (BOM). This web application, written in R, allows the semi-automated discovery of punctual biodiversity observations (e.g. biotic interactions, functional or behavioural traits and natural history descriptions) associated with the scientific names present inside a corpus of scientific literature. Furthermore, BOM enables users to rapidly screen large quantities of literature based on word co-occurrences that match custom biodiversity dictionaries. This tool aims to increase the digital mobilisation of primary biodiversity data and is freely accessible via GitHub or through a web server.
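The dictionary-based screening described above can be illustrated with a minimal sketch. This is not BOM's code (BOM is written in R, and its internals are not shown here); it is a Python illustration under stated assumptions: the scientific names are already known, the dictionary is a small hypothetical set of interaction terms, and sentences are split naively on end punctuation. The names `INTERACTION_TERMS` and `screen_sentences` are invented for this example.

```python
import re

# Hypothetical mini-dictionary of biotic-interaction terms (illustrative only;
# not a dictionary shipped with any real tool).
INTERACTION_TERMS = {"pollinates", "pollinated", "feeds", "preys", "disperses"}


def screen_sentences(text, names, dictionary):
    """Return (sentence, found_names, matched_terms) for every sentence in
    which at least one known scientific name co-occurs with at least one
    dictionary term."""
    hits = []
    # Naive sentence split: break after '.', '!' or '?' followed by whitespace.
    # Abbreviations such as "e.g." would be split incorrectly.
    for sentence in re.split(r"(?<=[.!?])\s+", text.strip()):
        # Assumption: names are supplied by the caller; no name recognition
        # is attempted here.
        found_names = [name for name in names if name in sentence]
        # Lowercased word set of the sentence, intersected with the dictionary.
        words = set(re.findall(r"[a-z]+", sentence.lower()))
        matched_terms = sorted(words & dictionary)
        if found_names and matched_terms:
            hits.append((sentence, found_names, matched_terms))
    return hits


if __name__ == "__main__":
    sample = (
        "Bombus terrestris pollinates Trifolium pratense in lowland meadows. "
        "Sampling took place in 2015. "
        "Bombus terrestris was common."
    )
    known_names = ["Bombus terrestris", "Trifolium pratense"]
    for sentence, found, terms in screen_sentences(
        sample, known_names, INTERACTION_TERMS
    ):
        print(found, terms, "|", sentence)
```

In this sketch only the first sample sentence is kept, because it is the only one where a known name and a dictionary term appear together; a human reviewer would still need to confirm that the retained sentence really reports an observation, which is what makes the workflow semi-automated.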

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/ea54/6344444/867b282032de/bdj-07-e28737-g001.jpg
