A Simple Standard for Sharing Ontological Mappings (SSSOM).

Affiliations

Semanticly Ltd, London WC2H 9JQ, UK.

RENCI, University of North Carolina, Chapel Hill, NC 27517, USA.

Publication information

Database (Oxford). 2022 May 25;2022. doi: 10.1093/database/baac035.

DOI: 10.1093/database/baac035
PMID: 35616100
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC9216545/
Abstract

Despite progress in the development of standards for describing and exchanging scientific information, the lack of easy-to-use standards for mapping between different representations of the same or similar objects in different databases poses a major impediment to data integration and interoperability. Mappings often lack the metadata needed to be correctly interpreted and applied. For example, are two terms equivalent or merely related? Are they narrow or broad matches? Or are they associated in some other way? Such relationships between the mapped terms are often not documented, which leads to incorrect assumptions and makes them hard to use in scenarios that require a high degree of precision (such as diagnostics or risk prediction). Furthermore, the lack of descriptions of how mappings were done makes it hard to combine and reconcile mappings, particularly curated and automated ones. We have developed the Simple Standard for Sharing Ontological Mappings (SSSOM) which addresses these problems by: (i) Introducing a machine-readable and extensible vocabulary to describe metadata that makes imprecision, inaccuracy and incompleteness in mappings explicit. (ii) Defining an easy-to-use simple table-based format that can be integrated into existing data science pipelines without the need to parse or query ontologies, and that integrates seamlessly with Linked Data principles. (iii) Implementing open and community-driven collaborative workflows that are designed to evolve the standard continuously to address changing requirements and mapping practices. (iv) Providing reference tools and software libraries for working with the standard. In this paper, we present the SSSOM standard, describe several use cases in detail and survey some of the existing work on standardizing the exchange of mappings, with the goal of making mappings Findable, Accessible, Interoperable and Reusable (FAIR). The SSSOM specification can be found at http://w3id.org/sssom/spec. 
Database URL: http://w3id.org/sssom/spec.
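
Point (ii) of the abstract, a simple table-based format that fits into data science pipelines without parsing ontologies, can be illustrated with a minimal sketch. The sample rows, the identifiers (`A:001`, `B:101`, and so on), the confidence values, and the helper names `read_mappings` and `exact_matches` are hypothetical and chosen only for illustration. The column names and the `#`-prefixed metadata header follow the general shape of an SSSOM TSV file, but the specification at http://w3id.org/sssom/spec should be consulted for the authoritative column list.

```python
import csv

# Hypothetical SSSOM-style table: '#' lines carry metadata, the rest is TSV.
SSSOM_TSV = """\
# curie_map:
#   skos: http://www.w3.org/2004/02/skos/core#
subject_id\tpredicate_id\tobject_id\tconfidence
A:001\tskos:exactMatch\tB:101\t0.95
A:002\tskos:broadMatch\tB:102\t0.80
A:003\tskos:relatedMatch\tB:103\t0.60
"""


def read_mappings(text):
    """Parse SSSOM-style TSV text, skipping '#'-prefixed metadata lines."""
    rows = [line for line in text.splitlines()
            if line and not line.startswith("#")]
    return list(csv.DictReader(rows, delimiter="\t"))


def exact_matches(mappings, min_confidence=0.9):
    """Keep only exact matches at or above a confidence threshold."""
    return [
        (m["subject_id"], m["object_id"])
        for m in mappings
        if m["predicate_id"] == "skos:exactMatch"
        and float(m["confidence"]) >= min_confidence
    ]


mappings = read_mappings(SSSOM_TSV)
print(exact_matches(mappings))
```

Because the relationship between the mapped terms is stated explicitly in `predicate_id`, a pipeline that needs high precision can select exact matches and leave broad or related matches aside, rather than assuming every mapping means equivalence.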


