Improving Biomedical Knowledge Graph Quality: A Community Approach.

Author information

Cortes Katherina G, Sundar Shilpa, Gehrke Sarah, Manpearl Keenan, Lin Junxia, Korn Daniel Robert, Caufield Harry, Schaper Kevin, Reese Justin, Koirala Kushal, Hunter Lawrence E, Carter E Kathleen, DeLuca Marcello, Krishnan Arjun, Mungall Chris, Haendel Melissa

Affiliations

Department of Biomedical Informatics, University of Colorado Anschutz Medical Campus.

Carolina Health Informatics Program, University of North Carolina at Chapel Hill.

Publication information

ArXiv. 2025 Aug 29:arXiv:2508.21774v1.

Abstract

Biomedical knowledge graphs (KGs) are widely used across research and translational settings, yet their design decisions and implementation are often opaque. Unlike ontologies that more frequently adhere to established creation principles, biomedical KGs lack consistent practices for construction, documentation, and dissemination. To address this gap, we introduce a set of evaluation criteria grounded in widely accepted data standards and principles from related fields. We apply these criteria to 16 biomedical KGs, revealing that even those that appear to align with best practices often obscure essential information required for external reuse. Moreover, biomedical KGs, despite pursuing similar goals and ingesting the same sources in some cases, display substantial variation in models, source integration, and terminology for node types. Reaping the potential benefits of knowledge graphs for biomedical research while reducing duplicated effort requires community-wide adoption of shared criteria and maturation of standards such as Biolink and KGX. Such improvements in transparency and standardization are essential for creating long-term reusability, improving comparability across resources, providing a rigorous foundation for artificial intelligence models, and enhancing the overall utility of KGs within biomedicine.

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/6bd2/12407614/1c3d18af9bb0/nihpp-2508.21774v1-f0001.jpg
