

SIENA: Semi-automatic semantic enhancement of datasets using concept recognition.

Affiliations

Institute of Data Science, Maastricht University, Universiteitsingel 60, Maastricht, 6229 ER, Netherlands.

Department of Data Science and Knowledge Engineering, Maastricht University, Paul-Henri Spaaklaan 1, Maastricht, 6229 EN, Netherlands.

Publication information

J Biomed Semantics. 2021 Mar 24;12(1):5. doi: 10.1186/s13326-021-00239-z.

Abstract

BACKGROUND

The amount of available data that can help answer scientific research questions is growing. However, the variety of formats in which data are published is growing as well, which creates a serious challenge when multiple datasets must be integrated to answer a question.

RESULTS

This paper presents a semi-automated framework that provides semantic enhancement of biomedical data, specifically gene datasets. The framework combines a machine-learning-based concept recognition task with the BioPortal annotator. Compared with methods that rely only on the BioPortal annotator for semantic enhancement, the proposed framework achieves the best results.

CONCLUSIONS

By combining concept recognition, machine learning techniques, and annotation with a biomedical ontology, the proposed framework can help datasets reach their full potential for providing meaningful information that can answer scientific research questions.


https://cdn.ncbi.nlm.nih.gov/pmc/blobs/83b7/7992819/5f90400ef18b/13326_2021_239_Fig1_HTML.jpg
