Brigham and Women's Hospital, Boston, MA, 02115, USA.
Florida State University, Tallahassee, FL, 32306, USA.
Med Phys. 2020 Nov;47(11):5953-5965. doi: 10.1002/mp.14445. Epub 2020 Sep 6.
The dataset contains annotations for lung nodules collected by the Lung Image Database Consortium and Image Database Resource Initiative (LIDC-IDRI), stored as standard DICOM objects. The annotations accompany a collection of computed tomography (CT) scans of over 1000 subjects annotated by multiple expert readers, and correspond to "nodules ≥ 3 mm", defined as any lesion considered to be a nodule with greatest in-plane dimension in the range 3-30 mm, regardless of presumed histology. The dataset aims to simplify reuse of the data with readily available tools, and is targeted at researchers interested in the analysis of lung CT images.
Open-source tools were used to parse the project-specific XML representation of the LIDC-IDRI annotations and save the result as standard DICOM objects. Validation procedures focused on establishing compliance of the resulting objects with the standard, verifying consistency of the data between the DICOM and project-specific representations, and evaluating interoperability with existing tools.
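As a rough illustration of the parsing step, the sketch below reads nodule contours from a small XML fragment using only the Python standard library. The fragment is a simplified, hypothetical stand-in: the element names (`unblindedReadNodule`, `noduleID`, `roi`, `imageZposition`, `edgeMap`, `xCoord`, `yCoord`) are assumptions that should be checked against the actual LIDC-IDRI files, which also carry an XML namespace and additional fields. The function `read_contours` is illustrative and is not the tool used to build the dataset; writing the DICOM objects is omitted.

```python
import xml.etree.ElementTree as ET

# Simplified, hypothetical fragment; real annotation files contain
# an XML namespace and many more fields than shown here.
SAMPLE_XML = """
<readingSession>
  <unblindedReadNodule>
    <noduleID>Nodule 001</noduleID>
    <roi>
      <imageZposition>-125.0</imageZposition>
      <edgeMap><xCoord>312</xCoord><yCoord>355</yCoord></edgeMap>
      <edgeMap><xCoord>313</xCoord><yCoord>355</yCoord></edgeMap>
      <edgeMap><xCoord>313</xCoord><yCoord>356</yCoord></edgeMap>
    </roi>
  </unblindedReadNodule>
</readingSession>
"""


def read_contours(xml_text):
    """Return one record per ROI: nodule ID, slice position, contour points."""
    root = ET.fromstring(xml_text.strip())
    contours = []
    for nodule in root.iter("unblindedReadNodule"):
        nodule_id = nodule.findtext("noduleID")
        for roi in nodule.iter("roi"):
            z_mm = float(roi.findtext("imageZposition"))
            # Each edgeMap element holds one pixel coordinate on the contour.
            points = [
                (int(edge.findtext("xCoord")), int(edge.findtext("yCoord")))
                for edge in roi.iter("edgeMap")
            ]
            contours.append(
                {"nodule_id": nodule_id, "z_mm": z_mm, "points": points}
            )
    return contours


contours = read_contours(SAMPLE_XML)
```

Each record pairs a reader's nodule identifier with one planar contour, which is the information a converter would need before rasterizing the contour into a segmentation frame.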
The dataset uses DICOM Segmentation objects to store the annotations of the lung nodules, and DICOM Structured Reporting objects to communicate the qualitative evaluations (nine attributes) and quantitative measurements (three attributes) associated with the nodules. A total of 875 subjects have 6859 nodule annotations. Clustering of neighboring annotations resulted in 2651 distinct nodules. The data are available in TCIA at https://doi.org/10.7937/TCIA.2018.h7umfurq.
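To illustrate how several readers' annotations can be grouped into distinct nodules, the sketch below applies single-linkage clustering to annotation centroids with a union-find structure. This is a generic technique chosen for illustration, not necessarily the clustering procedure used to produce the 2651 nodules; the function name `cluster_annotations`, the centroid coordinates, and the 3 mm threshold are all hypothetical.

```python
from math import dist


def cluster_annotations(centroids, max_distance_mm):
    """Group annotation indices whose centroids are linked by a chain of
    pairwise distances no larger than max_distance_mm (single linkage)."""
    parent = list(range(len(centroids)))

    def find(i):
        # Follow parent links to the cluster root, compressing the path.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for i in range(len(centroids)):
        for j in range(i + 1, len(centroids)):
            if dist(centroids[i], centroids[j]) <= max_distance_mm:
                parent[find(i)] = find(j)  # merge the two clusters

    clusters = {}
    for i in range(len(centroids)):
        clusters.setdefault(find(i), []).append(i)
    return sorted(clusters.values())


# Hypothetical annotation centroids (x, y, z) in millimetres:
# three readers marked the same lesion, a fourth marked a separate one.
centroids = [
    (10.0, 10.0, 5.0),
    (10.5, 9.8, 5.0),
    (11.0, 10.2, 5.5),
    (60.0, 40.0, 20.0),
]
nodules = cluster_annotations(centroids, max_distance_mm=3.0)
```

With these inputs the first three annotations form one nodule and the fourth stands alone. A centroid-distance criterion is only one option; overlap between the segmentation masks themselves would be a reasonable alternative.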
The standardized dataset maintains the content of the original contribution of the LIDC-IDRI consortium, and should be helpful in developing automated tools for characterization of lung lesions and image phenotyping. Beyond preserving that content, the DICOM representation makes the dataset more FAIR (Findable, Accessible, Interoperable, Reusable) for the research community, and enables its integration with other standardized data collections.