Cortes Katherina G, Sundar Shilpa, Gehrke Sarah, Manpearl Keenan, Lin Junxia, Korn Daniel Robert, Caufield Harry, Schaper Kevin, Reese Justin, Koirala Kushal, Hunter Lawrence E, Carter E Kathleen, DeLuca Marcello, Krishnan Arjun, Mungall Chris, Haendel Melissa
Department of Biomedical Informatics, University of Colorado Anschutz Medical Campus.
Carolina Health Informatics Program, University of North Carolina at Chapel Hill.
ArXiv. 2025 Aug 29:arXiv:2508.21774v1.
Biomedical knowledge graphs (KGs) are widely used across research and translational settings, yet their design decisions and implementation are often opaque. Unlike ontologies, which more often adhere to established creation principles, biomedical KGs lack consistent practices for construction, documentation, and dissemination. To address this gap, we introduce a set of evaluation criteria grounded in widely accepted data standards and principles from related fields. We apply these criteria to 16 biomedical KGs, revealing that even those that appear to align with best practices often obscure essential information required for external reuse. Moreover, biomedical KGs, despite pursuing similar goals and ingesting the same sources in some cases, display substantial variation in models, source integration, and terminology for node types. Reaping the potential benefits of knowledge graphs for biomedical research while reducing duplicated effort requires community-wide adoption of shared criteria and maturation of standards such as Biolink and KGX. Such improvements in transparency and standardization are essential for ensuring long-term reusability, improving comparability across resources, providing a rigorous foundation for artificial intelligence models, and enhancing the overall utility of KGs within biomedicine.