An empirical meta-analysis of the life sciences linked open data on the web.

Affiliations

Center for Biomedical Informatics Research, Stanford University, Stanford, CA, USA.

Elsevier Health Markets, Philadelphia, PA, USA.

Publication information

Sci Data. 2021 Jan 21;8(1):24. doi: 10.1038/s41597-021-00797-y.

DOI: 10.1038/s41597-021-00797-y

PMID: 33479214

Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC7819992/

Abstract

While the biomedical community has published several "open data" sources in the last decade, most researchers still endure severe logistical and technical challenges to discover, query, and integrate heterogeneous data and knowledge from multiple sources. To tackle these challenges, the community has experimented with Semantic Web and linked data technologies to create the Life Sciences Linked Open Data (LSLOD) cloud. In this paper, we extract schemas from more than 80 biomedical linked open data sources into an LSLOD schema graph and conduct an empirical meta-analysis to evaluate the extent of semantic heterogeneity across the LSLOD cloud. We observe that several LSLOD sources exist as stand-alone data sources that are not inter-linked with other sources, use unpublished schemas with minimal reuse or mappings, and have elements that are not useful for data integration from a biomedical perspective. We envision that the LSLOD schema graph and the findings from this research will aid researchers who wish to query and integrate data and knowledge from multiple biomedical sources simultaneously on the Web.

Figure 1 (image): https://cdn.ncbi.nlm.nih.gov/pmc/blobs/cd1e/7819992/e0dce1df16f3/41597_2021_797_Fig1_HTML.jpg

Similar articles

An empirical meta-analysis of the life sciences linked open data on the web.

Sci Data. 2021 Jan 21;8(1):24. doi: 10.1038/s41597-021-00797-y.

Enabling Web-scale data integration in biomedicine through Linked Open Data.

NPJ Digit Med. 2019 Sep 10;2:90. doi: 10.1038/s41746-019-0162-5. eCollection 2019.

PhLeGrA: Graph Analytics in Pharmacology over the Web of Life Sciences Linked Open Data.

Proc Int World Wide Web Conf. 2017 Apr;2017:321-329. doi: 10.1145/3038912.3052692.

A journey to Semantic Web query federation in the life sciences.

BMC Bioinformatics. 2009 Oct 1;10 Suppl 10(Suppl 10):S10. doi: 10.1186/1471-2105-10-S10-S10.

An ontology-driven semantic mashup of gene and biological pathway information: application to the domain of nicotine dependence.

J Biomed Inform. 2008 Oct;41(5):752-65. doi: 10.1016/j.jbi.2008.02.006. Epub 2008 Feb 29.

Semantic Web technologies for the big data in life sciences.

Biosci Trends. 2014 Aug;8(4):192-201. doi: 10.5582/bst.2014.01048.

Generation of open biomedical datasets through ontology-driven transformation and integration processes.

J Biomed Semantics. 2016 Jun 3;7:32. doi: 10.1186/s13326-016-0075-z.

Towards virtual knowledge broker services for semantic integration of life science literature and data sources.

Drug Discov Today. 2013 May;18(9-10):428-34. doi: 10.1016/j.drudis.2012.11.012. Epub 2012 Dec 12.

BioFed: federated query processing over life sciences linked open data.

J Biomed Semantics. 2017 Mar 15;8(1):13. doi: 10.1186/s13326-017-0118-0.

Life sciences on the Semantic Web: the Neurocommons and beyond.

Brief Bioinform. 2009 Mar;10(2):193-204. doi: 10.1093/bib/bbp004. Epub 2009 Mar 12.

Cited by

Artificial intelligence: the human response to approach the complexity of big data in biology.

Gigascience. 2025 Jan 6;14. doi: 10.1093/gigascience/giaf057.

Computational tools and data integration to accelerate vaccine development: challenges, opportunities, and future directions.

Front Immunol. 2025 Mar 7;16:1502484. doi: 10.3389/fimmu.2025.1502484. eCollection 2025.

Snowflake Data Warehouse for Large-Scale and Diverse Biological Data Management and Analysis.

Genes (Basel). 2024 Dec 28;16(1):34. doi: 10.3390/genes16010034.

Generic and queryable data integration schema for transcriptomics and epigenomics studies.

Comput Struct Biotechnol J. 2024 Nov 19;23:4232-4241. doi: 10.1016/j.csbj.2024.11.022. eCollection 2024 Dec.

Regulus infers signed regulatory relations from few samples' information using discretization and likelihood constraints.

PLoS Comput Biol. 2024 Jan 22;20(1):e1011816. doi: 10.1371/journal.pcbi.1011816. eCollection 2024 Jan.

Metadata integrity in bioinformatics: Bridging the gap between data and knowledge.

Comput Struct Biotechnol J. 2023 Oct 5;21:4895-4913. doi: 10.1016/j.csbj.2023.10.006. eCollection 2023.

Specimen, biological structure, and spatial ontologies in support of a Human Reference Atlas.

Sci Data. 2023 Mar 27;10(1):171. doi: 10.1038/s41597-023-01993-8.

Moving Toward Findable, Accessible, Interoperable, Reusable Practices in Epidemiologic Research.

Am J Epidemiol. 2023 Jun 2;192(6):995-1005. doi: 10.1093/aje/kwad040.

Data platforms for open life sciences-A systematic analysis of management instruments.

PLoS One. 2022 Oct 25;17(10):e0276204. doi: 10.1371/journal.pone.0276204. eCollection 2022.

A COMPASS for VESPUCCI: A FAIR Way to Explore the Grapevine Transcriptomic Landscape.

Front Plant Sci. 2022 Feb 24;13:815443. doi: 10.3389/fpls.2022.815443. eCollection 2022.

References

Text Snippets to Corroborate Medical Relations: An Unsupervised Approach using a Knowledge Graph and Embeddings.

AMIA Jt Summits Transl Sci Proc. 2020 May 30;2020:288-297. eCollection 2020.

Enabling Web-scale data integration in biomedicine through Linked Open Data.

NPJ Digit Med. 2019 Sep 10;2:90. doi: 10.1038/s41746-019-0162-5. eCollection 2019.

NBDC RDF portal: a comprehensive repository for semantic data in life sciences.

Database (Oxford). 2018 Jan 1;2018:bay123. doi: 10.1093/database/bay123.

Investigating Term Reuse and Overlap in Biomedical Ontologies.

CEUR Workshop Proc. 2015 Jul;1515. Epub 2015 Nov 18.

A global network of biomedical relationships derived from text.

Bioinformatics. 2018 Aug 1;34(15):2614-2624. doi: 10.1093/bioinformatics/bty114.

PhLeGrA: Graph Analytics in Pharmacology over the Web of Life Sciences Linked Open Data.

Proc Int World Wide Web Conf. 2017 Apr;2017:321-329. doi: 10.1145/3038912.3052692.

Predicting species emergence in simulated complex pre-biotic networks.

PLoS One. 2018 Feb 15;13(2):e0192871. doi: 10.1371/journal.pone.0192871. eCollection 2018.

A Systematic Analysis of Term Reuse and Term Overlap across Biomedical Ontologies.

Semant Web. 2017;8(6):853-871. doi: 10.3233/sw-160238.

DisGeNET: a comprehensive platform integrating information on human disease-associated genes and variants.

Nucleic Acids Res. 2017 Jan 4;45(D1):D833-D839. doi: 10.1093/nar/gkw943. Epub 2016 Oct 19.

The Monarch Initiative: an integrative data and analytic platform connecting phenotypes to genotypes across species.

Nucleic Acids Res. 2017 Jan 4;45(D1):D712-D722. doi: 10.1093/nar/gkw1128. Epub 2016 Nov 29.
