FAIR data representation in times of eScience: a comparison of instance-based and class-based semantic representations of empirical data using phenotype descriptions as example.

Affiliation

TIB Leibniz Information Centre for Science and Technology, Welfengarten 1B, 30167, Hanover, Germany.

Publication information

J Biomed Semantics. 2021 Nov 25;12(1):20. doi: 10.1186/s13326-021-00254-0.

DOI:10.1186/s13326-021-00254-0
PMID:34823588
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC8613519/
Abstract

BACKGROUND

The size, velocity, and heterogeneity of Big Data outclass conventional data management tools and require data and metadata to be fully machine-actionable (i.e., eScience-compliant) and thus findable, accessible, interoperable, and reusable (FAIR). This can be achieved by using ontologies and through representing them as semantic graphs. Here, we discuss two different semantic graph approaches of representing empirical data and metadata in a knowledge graph, with phenotype descriptions as an example. Almost all phenotype descriptions are still being published as unstructured natural language texts, with far-reaching consequences for their FAIRness, substantially impeding their overall usability within the life sciences. However, with an increasing number of anatomy ontologies becoming available and semantic applications emerging, a solution to this problem becomes available. Researchers are starting to document and communicate phenotype descriptions through the Web in the form of highly formalized and structured semantic graphs that use ontology terms and Uniform Resource Identifiers (URIs) to circumvent the problems connected with unstructured texts.
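To make the idea of a structured semantic graph concrete, here is a minimal sketch of a single phenotype statement ("the head of specimen 42 is red") written as subject-predicate-object triples and queried by pattern matching. The `http://example.org/` namespace, the term names, and the helpers `uri` and `match` are all hypothetical and chosen for illustration only; they are not taken from the paper or from any real ontology.

```python
# Hypothetical namespace and term names, for illustration only.
EX = "http://example.org/"


def uri(local_name: str) -> str:
    """Build a full identifier from a local name."""
    return EX + local_name


# "The head of specimen 42 is red", as subject-predicate-object triples.
triples = {
    (uri("specimen42"), uri("hasPart"), uri("head_of_specimen42")),
    (uri("head_of_specimen42"), uri("instanceOf"), uri("Head")),
    (uri("head_of_specimen42"), uri("hasQuality"), uri("redness_of_head42")),
    (uri("redness_of_head42"), uri("instanceOf"), uri("Red")),
}


def match(graph, s=None, p=None, o=None):
    """Return the triples that fit the pattern; None acts as a wildcard."""
    return sorted(
        t for t in graph
        if (s is None or t[0] == s)
        and (p is None or t[1] == p)
        and (o is None or t[2] == o)
    )


# Find every resource typed as Red, without parsing any free text.
red_things = [s for s, _, _ in match(triples, p=uri("instanceOf"), o=uri("Red"))]
print(red_things)
```

Because every term is an identifier rather than a word in a sentence, the lookup does not depend on spelling, synonyms, or sentence structure, which is the kind of problem with unstructured text that the abstract refers to.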

RESULTS

Using phenotype descriptions as an example, we compare and evaluate two basic representations of empirical data and their accompanying metadata in the form of semantic graphs: the class-based TBox semantic graph approach called Semantic Phenotype and the instance-based ABox semantic graph approach called Phenotype Knowledge Graph. Their main difference is that only the ABox approach allows for identifying every individual part and property mentioned in the description in a knowledge graph. This technical difference results in substantial practical consequences that significantly affect the overall usability of empirical data. The consequences affect findability, accessibility, and explorability of empirical data as well as their comparability, expandability, universal usability and reusability, and overall machine-actionability. Moreover, TBox semantic graphs often require querying under entailment regimes, which is computationally more complex.
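The contrast between the two approaches can be sketched with plain Python tuples. This is an assumption-laden illustration, not the paper's data model: the `http://example.org/` namespace, the predicate names, and the helper functions are hypothetical, and the class-based encoding is only loosely modeled on how nested class expressions end up as anonymous (blank) nodes in a graph.

```python
EX = "http://example.org/"  # hypothetical namespace, for illustration only


def is_blank(node: str) -> bool:
    """Blank nodes are anonymous: they have no identifier of their own."""
    return node.startswith("_:")


# Instance-based (ABox-style): every part and quality is a named individual.
abox = {
    (EX + "specimen42", EX + "hasPart", EX + "head42"),
    (EX + "head42", "rdf:type", EX + "Head"),
    (EX + "head42", EX + "hasQuality", EX + "redness42"),
    (EX + "redness42", "rdf:type", EX + "Red"),
}

# Class-based (TBox-style): the whole description is one class expression,
# roughly "hasPart some (Head and hasQuality some Red)".
# Simplified, hypothetical encoding; the nested expressions are blank nodes.
tbox = {
    (EX + "Specimen42Phenotype", "rdfs:subClassOf", "_:r1"),
    ("_:r1", "onProperty", EX + "hasPart"),
    ("_:r1", "someValuesFrom", "_:c1"),
    ("_:c1", "intersectionMember", EX + "Head"),
    ("_:c1", "intersectionMember", "_:r2"),
    ("_:r2", "onProperty", EX + "hasQuality"),
    ("_:r2", "someValuesFrom", EX + "Red"),
}


def addressable_subjects(graph):
    """Subjects that carry their own URI and can be referenced from outside."""
    return sorted({s for s, _, _ in graph if not is_blank(s)})


def direct_instances(graph, cls):
    """Plain triple-pattern lookup, with no reasoning applied."""
    return sorted(s for s, p, o in graph if p == "rdf:type" and o == cls)


print(addressable_subjects(abox))          # specimen, head, and redness
print(addressable_subjects(tbox))          # only the phenotype class itself
print(direct_instances(abox, EX + "Red"))  # found by direct lookup
print(direct_instances(tbox, EX + "Red"))  # empty without further reasoning
```

In this sketch the instance-based graph exposes the head and its redness as resources with their own identifiers, so they can be found directly and annotated with further metadata. In the class-based graph the same information sits inside anonymous nodes, so a query has to interpret the class expression, for example through a reasoner, which matches the abstract's remark about entailment regimes.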

CONCLUSIONS

We conclude that, from a conceptual point of view, the advantages of the instance-based ABox semantic graph approach outweigh its shortcomings and outweigh the advantages of the class-based TBox semantic graph approach. Therefore, we recommend the instance-based ABox approach as a FAIR approach for documenting and communicating empirical data and metadata in a knowledge graph.

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/d1d6/8613992/570fe4c270c8/13326_2021_254_Fig1_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/d1d6/8613992/3c540ceaa6b1/13326_2021_254_Fig2_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/d1d6/8613992/8216a6aaac0d/13326_2021_254_Fig3_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/d1d6/8613992/cf4ff87fd642/13326_2021_254_Fig4_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/d1d6/8613992/0757f788f023/13326_2021_254_Fig5_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/d1d6/8613992/672188b9865c/13326_2021_254_Fig6_HTML.jpg
