Exploring and linking biomedical resources through multidimensional semantic spaces.

Publication information

BMC Bioinformatics. 2012 Jan 25;13 Suppl 1(Suppl 1):S6. doi: 10.1186/1471-2105-13-S1-S6.

Abstract

BACKGROUND

The semantic integration of biomedical resources remains a challenging problem, yet it is required for effective information processing and data analysis. Comprehensive knowledge resources such as biomedical ontologies and integrated thesauri greatly facilitate this integration through semantic annotation, which allows disparate data formats and contents to be expressed in a common semantic space. In this paper, we propose a multidimensional representation for such a semantic space, where the dimensions correspond to the different perspectives in biomedical research (e.g., population, disease, anatomy, and proteins/genes).

RESULTS

This paper presents a novel method for building multidimensional semantic spaces from semantically annotated biomedical data collections. The method consists of two main processes: knowledge normalization and data normalization. The former arranges the concepts provided by a reference knowledge resource (e.g., biomedical ontologies and thesauri) into a set of hierarchical dimensions for analysis purposes. The latter reduces the annotation set associated with each collection item to a set of points in the multidimensional space. Additionally, we have developed a visual tool, called 3D-Browser, which implements OLAP-like operators over the generated multidimensional space. The method and the tool have been tested and evaluated in the context of the Health-e-Child (HeC) project. Automatic semantic annotation was applied to tag three collections of abstracts taken from PubMed (one for each target disease of the project), the UniProt database, and the HeC patient record database. We adopted the UMLS Metathesaurus 2010AA as the reference knowledge resource.
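
The data normalization step and an OLAP-like roll-up can be illustrated with a minimal Python sketch. It is not the authors' implementation: the toy hierarchy `PARENT`, the `DIMENSION_ROOTS` mapping, and the functions `to_points` and `roll_up` are hypothetical names introduced here. The sketch also assumes that an item's points are the Cartesian product of its annotations per dimension, which the abstract does not specify.

```python
from collections import Counter, defaultdict
from itertools import product

# Hypothetical toy hierarchy (child concept -> parent concept). In the paper
# the concepts come from a reference resource such as the UMLS Metathesaurus.
PARENT = {
    "cardiomyopathy": "heart disease",
    "heart disease": "disease",
    "left ventricle": "heart",
    "heart": "anatomy",
    "TNNT2": "gene",
}

# Hypothetical result of knowledge normalization: root concept -> dimension.
DIMENSION_ROOTS = {"disease": "Disease", "anatomy": "Anatomy", "gene": "Protein/Gene"}


def dimension_of(concept):
    """Walk up the hierarchy and return the dimension of the root, if any."""
    while concept in PARENT:
        concept = PARENT[concept]
    return DIMENSION_ROOTS.get(concept)


def to_points(annotations, dimensions):
    """Reduce an annotation set to points with one coordinate per dimension.

    Assumption: points are the Cartesian product of the concepts found in
    each dimension; a dimension without annotations gets the coordinate None.
    """
    by_dim = defaultdict(set)
    for concept in annotations:
        dim = dimension_of(concept)
        if dim is not None:  # concepts outside every dimension are dropped
            by_dim[dim].add(concept)
    axes = [sorted(by_dim[d]) or [None] for d in dimensions]
    return set(product(*axes))


def roll_up(points, dim_index):
    """OLAP-like roll-up: move one dimension one level up and count points."""
    counts = Counter()
    for point in points:
        coord = point[dim_index]
        coarser = PARENT.get(coord, coord)  # roots and None stay unchanged
        counts[point[:dim_index] + (coarser,) + point[dim_index + 1:]] += 1
    return counts


DIMS = ["Disease", "Anatomy", "Protein/Gene"]
item_points = to_points({"cardiomyopathy", "left ventricle", "TNNT2", "unmapped"}, DIMS)
rolled = roll_up(item_points, 0)
```

Here `item_points` holds a single point, and `roll_up(item_points, 0)` moves its disease coordinate from "cardiomyopathy" to the coarser "heart disease" while leaving the other dimensions untouched.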

CONCLUSIONS

Current knowledge resources and semantic-aware technology make the integration of biomedical resources possible. This integration is performed through semantic annotation of the intended biomedical data resources. This paper shows how these annotations can be exploited for integration, exploration, and analysis tasks. Results from a real scenario demonstrate the viability and usefulness of the approach, as well as the quality of the generated multidimensional semantic spaces.


https://cdn.ncbi.nlm.nih.gov/pmc/blobs/944b/3471347/d64cd6c51e0f/1471-2105-13-S1-S6-1.jpg
