Prediction and curation of missing biomedical identifier mappings with Biomappings.

Affiliations

Laboratory of Systems Pharmacology, Harvard Medical School, Boston, MA 02115, United States.

Beth Israel Deaconess Medical Center, Boston, MA 02215, United States.

Publication information

Bioinformatics. 2023 Apr 3;39(4). doi: 10.1093/bioinformatics/btad130.

DOI: 10.1093/bioinformatics/btad130
PMID: 36916735
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC10076045/
Abstract

MOTIVATION

Biomedical identifier resources (such as ontologies, taxonomies, and controlled vocabularies) commonly overlap in scope and contain equivalent entries under different identifiers. Maintaining mappings between these entries is crucial for interoperability and the integration of data and knowledge. However, there are substantial gaps in the available mappings, which motivates their semi-automated curation.

RESULTS

Biomappings implements a curation workflow for missing mappings which combines automated prediction with human-in-the-loop curation. It supports multiple prediction approaches and provides a web-based user interface for reviewing predicted mappings for correctness, combined with automated consistency checking. Predicted and curated mappings are made available in public, version-controlled resource files on GitHub. Biomappings currently makes available 9274 curated mappings and 40 691 predicted ones, providing previously missing mappings between widely used identifier resources covering small molecules, cell lines, diseases, and other concepts. We demonstrate the value of Biomappings on case studies involving predicting and curating missing mappings among cancer cell lines as well as small molecules tested in clinical trials. We also present how previously missing mappings curated using Biomappings were contributed back to multiple widely used community ontologies.
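
The workflow described above (automated prediction, human review, and consistency checking) can be illustrated with a minimal sketch. This is not the Biomappings implementation: it assumes the simplest possible prediction approach (exact match on normalized labels), and the `Term` class, the function names, the `resA`/`resB` prefixes, and all identifiers are hypothetical, chosen only for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Term:
    """An entry in an identifier resource (hypothetical structure)."""
    prefix: str       # name of the identifier resource
    identifier: str   # local identifier within that resource
    label: str        # human-readable name


def normalize(label: str) -> str:
    """Lowercase a label and drop non-alphanumeric characters."""
    return "".join(ch for ch in label.lower() if ch.isalnum())


def predict_mappings(source_terms, target_terms, known):
    """Propose (source, target) pairs with equal normalized labels,
    skipping pairs that are already in the set of known mappings."""
    index = {}
    for term in target_terms:
        index.setdefault(normalize(term.label), []).append(term)
    predictions = []
    for term in source_terms:
        for candidate in index.get(normalize(term.label), []):
            if (term, candidate) not in known:
                predictions.append((term, candidate))
    return predictions


def curate(predictions, review):
    """Human-in-the-loop step: `review` stands in for a curator's decision."""
    accepted, rejected = [], []
    for source, target in predictions:
        (accepted if review(source, target) else rejected).append((source, target))
    return accepted, rejected


def find_conflicts(mappings):
    """Consistency check: flag a source term mapped to two different
    targets within the same target resource."""
    seen, conflicts = {}, []
    for source, target in mappings:
        key = (source, target.prefix)
        if key in seen and seen[key] != target:
            conflicts.append((source, seen[key], target))
        seen.setdefault(key, target)
    return conflicts


# Made-up example data; none of these identifiers are real.
source = [Term("resA", "0001", "HeLa cell"), Term("resA", "0002", "MCF-7")]
target = [
    Term("resB", "X1", "hela cell"),
    Term("resB", "X2", "MCF7"),
    Term("resB", "X3", "mcf 7"),
]

predictions = predict_mappings(source, target, known=set())
# The lambda simulates a curator rejecting the redundant X3 candidate.
accepted, rejected = curate(predictions, review=lambda s, t: t.identifier != "X3")
print(len(predictions), len(accepted), len(rejected))
print(len(find_conflicts(predictions)), len(find_conflicts(accepted)))
```

In this toy setup the raw predictions contain one source term with two candidate targets in the same resource, which the consistency check flags; after the simulated review, the accepted set is conflict-free. A real system would use stronger predictors and store the reviewer's identity and the evidence for each decision.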

AVAILABILITY AND IMPLEMENTATION

The data and code are available under the CC0 and MIT licenses at https://github.com/biopragmatics/biomappings.


Figures

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/2076/10076045/d0acece2fa68/btad130f1.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/2076/10076045/5552ee6898e2/btad130f2.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/2076/10076045/db7fbb89a770/btad130f3.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/2076/10076045/105239456bd8/btad130f4.jpg
