Department of Biomedical Informatics, University of Arkansas for Medical Sciences, Little Rock, Arkansas, United States of America.
Department of Health Outcomes and Biomedical Informatics, University of Florida, Gainesville, Florida, United States of America.
Phys Med Biol. 2022 Dec 23;68(1):014003. doi: 10.1088/1361-6560/ac9d1d.
The Cancer Imaging Archive (TCIA) receives and manages an ever-increasing quantity of clinical (non-image) data containing valuable information about subjects in imaging collections. To harmonize and integrate these data, we first cataloged the types of information occurring across public TCIA collections. We then produced mappings for these diverse instance data using ontology-based representation patterns and transformed the data into a knowledge graph in a semantic database. This repository combines the transformed instance data with relevant background knowledge from domain ontologies. The resulting repository of semantically integrated data is a rich source of information about subjects that can be queried across imaging collections. Building on this work, we have implemented and deployed a REST API and a user-facing semantic cohort builder tool. This tool allows researchers and other users to search for and identify groups of subject-level records based on non-image data that were not queryable prior to this work. The search results produced by this interface link to images, so users can quickly identify and view images matching the selection criteria and export the harmonized clinical data.
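The harmonize-then-query idea summarized in the abstract can be illustrated with a toy sketch. This is not the authors' implementation: the collection names, column names, predicates, and the `VALUE_TERMS` lookup are all hypothetical, and a plain Python list of triples stands in for the semantic database and domain ontologies.

```python
from collections import defaultdict

# Hypothetical per-collection mappings from source column names to shared predicates.
MAPPINGS = {
    "collection_a": {"PatientID": "subject_id", "Sex": "sex", "Dx": "diagnosis"},
    "collection_b": {"case": "subject_id", "gender": "sex", "primary_diagnosis": "diagnosis"},
}

# Hypothetical value normalization, standing in for an ontology term lookup.
VALUE_TERMS = {"f": "female", "female": "female", "m": "male", "male": "male"}


def to_triples(collection, rows):
    """Convert raw rows into (subject, predicate, object) triples."""
    mapping = MAPPINGS[collection]
    triples = []
    for row in rows:
        # Rename source columns to the shared predicates; drop unmapped columns.
        harmonized = {mapping[k]: v for k, v in row.items() if k in mapping}
        subject = f"{collection}/{harmonized['subject_id']}"
        triples.append((subject, "in_collection", collection))
        for predicate, value in harmonized.items():
            if predicate == "subject_id":
                continue
            key = str(value).strip().lower()
            triples.append((subject, predicate, VALUE_TERMS.get(key, key)))
    return triples


def build_cohort(triples, **criteria):
    """Return sorted subjects whose triples satisfy every predicate=value criterion."""
    facts = defaultdict(set)
    for s, p, o in triples:
        facts[s].add((p, o))
    return sorted(
        s for s, po in facts.items()
        if all((p, v) in po for p, v in criteria.items())
    )


# Two collections that encode the same information differently.
graph = to_triples("collection_a", [
    {"PatientID": "A-001", "Sex": "F", "Dx": "Glioma"},
    {"PatientID": "A-002", "Sex": "M", "Dx": "Glioma"},
]) + to_triples("collection_b", [
    {"case": "B-17", "gender": "female", "primary_diagnosis": "glioma"},
])

cohort = build_cohort(graph, sex="female", diagnosis="glioma")
print(cohort)  # ['collection_a/A-001', 'collection_b/B-17']
```

Once both collections share the same predicates and value terms, a single query selects subjects across them, which is the property the cohort builder relies on. A real deployment would presumably use a triple store and a graph query language instead of this in-memory filter.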