Kantipudi Karthik, Bui Vy, Yu Hang, Lure Y M Fleming, Jaeger Stefan, Yaniv Ziv
National Institute of Allergy and Infectious Diseases, National Institutes of Health, Bethesda, MD 20892, USA.
National Library of Medicine, National Institutes of Health, Bethesda, MD 20894, USA.
Proc SPIE Int Soc Opt Eng. 2025 Feb;13407. doi: 10.1117/12.3047222. Epub 2025 Apr 4.
According to the 2023 World Health Organization report, an estimated 7.5 million people were diagnosed with tuberculosis (TB) in 2022. TB triage is often performed using chest X-rays (CXRs), and significant effort has been invested in automating this task with deep learning. A key concern with algorithms that output image-level labels, in our context TB/not-TB, is that they do not explicitly explain how the output was obtained, which limits user oversight. Semantic segmentation of TB lesions can enable human supervision as part of the diagnosis process. This work presents a new dataset, TB-Portals SIFT, which enables semantic segmentation of TB lesions in CXRs (6,328 images with 10,435 pseudo-label lesion instances). Using this data, ten semantic segmentation models from the UNet and YOLOv8-seg architectures were evaluated in a five-fold cross-validation study. The best-performing segmentation model from each architecture, nnUNet(ResEnc XL) and YOLOv8m-seg, and their ensemble were then evaluated for generalization on related classification and object detection tasks. Additionally, several binary DenseNet121 classifiers were trained, and their classification generalization performance was compared to that of the semantic segmentation-based classifier. Results show that the segmentation-based approach generalized better than the DenseNet121 classifiers and that the ensemble of the models from the two architectures was the most stable, closely matching or exceeding the performance of all other models across the tasks of segmentation, classification, and object detection. The dataset is publicly available from the NIAID TB Portals program after signing a data usage agreement, which is available from https://tbportals.niaid.nih.gov/download-data.
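The abstract does not state how a lesion mask is converted into a TB/not-TB label or how the two models' outputs are combined in the ensemble. The following is a minimal sketch of one plausible reading, under stated assumptions: the ensemble is taken as a pixel-wise union of two binary masks, and an image is labeled TB when the combined mask contains at least `min_lesion_pixels` lesion pixels. The function names, the union rule, the threshold, and the toy masks are all hypothetical illustrations, not the authors' method.

```python
from typing import List

Mask = List[List[int]]  # binary mask: 1 = lesion pixel, 0 = background


def ensemble_masks(mask_a: Mask, mask_b: Mask, mode: str = "union") -> Mask:
    """Combine two same-sized binary masks pixel by pixel.

    Assumption: the combination rule (union or intersection) is illustrative only.
    """
    same_rows = len(mask_a) == len(mask_b)
    if not same_rows or any(len(ra) != len(rb) for ra, rb in zip(mask_a, mask_b)):
        raise ValueError("masks must have the same shape")
    if mode == "union":
        return [[int(bool(a) or bool(b)) for a, b in zip(ra, rb)]
                for ra, rb in zip(mask_a, mask_b)]
    if mode == "intersection":
        return [[int(bool(a) and bool(b)) for a, b in zip(ra, rb)]
                for ra, rb in zip(mask_a, mask_b)]
    raise ValueError(f"unknown mode: {mode}")


def count_lesion_pixels(mask: Mask) -> int:
    """Number of pixels marked as lesion."""
    return sum(1 for row in mask for value in row if value)


def mask_to_label(mask: Mask, min_lesion_pixels: int = 1) -> int:
    """Image-level label derived from a mask: 1 = TB, 0 = not-TB.

    Assumption: the pixel-count threshold is a hypothetical decision rule.
    """
    return int(count_lesion_pixels(mask) >= min_lesion_pixels)


def dice(pred: Mask, ref: Mask) -> float:
    """Dice overlap between two binary masks (1.0 when both are empty)."""
    overlap = sum(1 for rp, rr in zip(pred, ref)
                  for p, r in zip(rp, rr) if p and r)
    total = count_lesion_pixels(pred) + count_lesion_pixels(ref)
    return 1.0 if total == 0 else 2.0 * overlap / total


# Toy 3x3 masks standing in for the outputs of two segmentation models.
mask_model_a: Mask = [[1, 1, 0], [0, 0, 0], [0, 0, 0]]
mask_model_b: Mask = [[0, 1, 0], [0, 1, 0], [0, 0, 0]]

combined = ensemble_masks(mask_model_a, mask_model_b, mode="union")
label = mask_to_label(combined)
agreement = dice(mask_model_a, mask_model_b)
```

Because the label is derived from the mask, a reviewer can inspect the predicted lesion regions behind each TB/not-TB decision, which is the oversight benefit the abstract attributes to segmentation-based classification.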