Reengineering Workflow for Curation of DICOM Datasets.

Affiliations

Department of Biomedical Informatics, UAMS, 4301 West Markham St, Little Rock, AR, 72205, USA.

Department of Radiation Oncology, Washington University School of Medicine, St. Louis, MO, USA.

Publication information

J Digit Imaging. 2018 Dec;31(6):783-791. doi: 10.1007/s10278-018-0097-4.

DOI:10.1007/s10278-018-0097-4

PMID:29907888

Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC6261183/

Abstract

Reusable, publicly available data is a pillar of open science and of rapid advancement in cancer imaging research. Sharing data from completed research studies not only saves the research dollars required to collect data, but also helps ensure that studies are both replicable and reproducible. The Cancer Imaging Archive (TCIA) is a global shared repository for imaging data related to cancer. Ensuring the consistency, scientific utility, and anonymity of data stored in TCIA is of utmost importance. As submissions to TCIA have increased, both in volume and in the complexity of the DICOM objects stored, curation of collections has become a bottleneck in the acquisition of data. To increase the rate of curation of image sets, improve the quality of the curation, and better track the provenance of changes made to submitted DICOM image sets, a custom set of tools was developed using novel methods for the analysis of DICOM datasets. These tools are written in Perl, use the open-source database PostgreSQL, make use of the Perl DICOM routines in the open-source package Posda, and incorporate DICOM diagnostic tools from other open-source packages, such as dicom3tools. These tools are referred to as the "Posda Tools." The Posda Tools are open source and available via git at https://github.com/UAMS-DBMI/PosdaTools.
In this paper, we briefly describe the Posda Tools and discuss the novel methods they employ to facilitate rapid analysis of DICOM data, including the following: (1) a database schema that is more permissive than, and normalized differently from, traditional DICOM databases; (2) integrity checks performed automatically on a bulk basis; (3) revisions applied to DICOM datasets on a bulk basis, either through a web-based interface or via command-line executable Perl scripts; (4) tracking of all such edits in a revision tracker, so that they may be rolled back; (5) a UI for inspecting the results of such edits to verify that they are what was intended; (6) identification of DICOM Studies, Series, and SOP instances by "nicknames" that are persistent and have well-defined scope, which makes reported DICOM errors easier to express and manage; and (7) rapid identification of potential duplicate DICOM datasets by pixel data, which can be used, e.g., to identify submission subjects that may relate to the same individual without identifying the individual.
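Item (7) can be illustrated with a minimal sketch. It is written in Python for brevity, not in the Perl of the Posda Tools, and it is not the authors' implementation. It assumes that each instance's pixel data has already been extracted as raw bytes, and it uses SHA-256 as the digest; the abstract names no particular hash. The function names and the record layout `(subject_id, sop_instance_uid, pixel_bytes)` are hypothetical.

```python
import hashlib
from collections import defaultdict


def pixel_digest(pixel_bytes: bytes) -> str:
    """Return a hex digest of raw pixel data (SHA-256 is an assumption)."""
    return hashlib.sha256(pixel_bytes).hexdigest()


def find_potential_duplicates(records):
    """Group instances whose pixel data hashes to the same digest.

    records: iterable of (subject_id, sop_instance_uid, pixel_bytes).
    Returns {digest: [(subject_id, sop_instance_uid), ...]} restricted to
    digests shared by more than one instance.
    """
    by_digest = defaultdict(list)
    for subject_id, sop_uid, pixel_bytes in records:
        by_digest[pixel_digest(pixel_bytes)].append((subject_id, sop_uid))
    return {d: hits for d, hits in by_digest.items() if len(hits) > 1}


def subjects_sharing_pixels(duplicates):
    """Return sorted pairs of distinct subjects with identical pixel data."""
    pairs = set()
    for hits in duplicates.values():
        subjects = sorted({subject for subject, _ in hits})
        for i, first in enumerate(subjects):
            for second in subjects[i + 1:]:
                pairs.add((first, second))
    return pairs


if __name__ == "__main__":
    # Hypothetical records: two subjects submit byte-identical pixel data.
    demo = [
        ("S1", "1.2.3", b"\x00\x01"),
        ("S2", "1.2.4", b"\x00\x01"),
        ("S3", "1.2.5", b"\x02"),
    ]
    print(subjects_sharing_pixels(find_potential_duplicates(demo)))
```

Because only digests are compared, subjects that may be the same individual can be flagged without looking at any identifying header fields. A digest match only signals byte-identical pixel data, so flagged pairs would still need review.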
