NNF Center for Protein Research, Faculty of Health Sciences, University of Copenhagen, Copenhagen, Denmark.
Li-Ka Shing Big Data Institute, University of Oxford, Oxford, UK.
Nat Biotechnol. 2022 May;40(5):692-702. doi: 10.1038/s41587-021-01145-6. Epub 2022 Jan 31.
Implementing precision medicine hinges on the integration of omics data, such as proteomics, into the clinical decision-making process, but the quantity and diversity of biomedical data, and the spread of clinically relevant knowledge across multiple biomedical databases and publications, pose a challenge to data integration. Here we present the Clinical Knowledge Graph (CKG), an open-source platform currently comprising close to 20 million nodes and 220 million relationships that represent relevant experimental data, public databases and literature. The graph structure provides a flexible data model that is easily extendable to new nodes and relationships as new databases become available. The CKG incorporates statistical and machine learning algorithms that accelerate the analysis and interpretation of typical proteomics workflows. Using a set of proof-of-concept biomarker studies, we show how the CKG might augment and enrich proteomics data and help inform clinical decision-making.