Lin Mingquan, Holste Gregory, Wang Song, Zhou Yiliang, Wei Yishu, Banerjee Imon, Chen Pengyi, Dai Tianjie, Du Yuexi, Dvornek Nicha C, Ge Yuyan, Guo Zuwei, Hanaoka Shouhei, Kim Dongkyun, Messina Pablo, Lu Yang, Parra Denis, Son Donghyun, Soto Álvaro, Urooj Aisha, Vidal René, Yamagishi Yosuke, Yan Pingkun, Yang Zefan, Zhang Ruichi, Zhou Yang, Celi Leo Anthony, Summers Ronald M, Lu Zhiyong, Chen Hao, Flanders Adam, Shih George, Wang Zhangyang, Peng Yifan
Department of Population Health Sciences, Weill Cornell Medicine, NY, USA; Department of Surgery, University of Minnesota, Minneapolis, USA.
Department of Electrical and Computer Engineering, The University of Texas at Austin, Austin, USA.
Med Image Anal. 2025 Jul 29;106:103739. doi: 10.1016/j.media.2025.103739.
The CXR-LT series is a community-driven initiative designed to improve lung disease classification from chest X-rays (CXRs). It addresses the challenges of open long-tailed lung disease classification and makes the performance of state-of-the-art techniques easier to measure. The first event, CXR-LT 2023, pursued these goals by providing high-quality benchmark CXR data for model development and by conducting comprehensive evaluations to identify ongoing issues that limit lung disease classification performance. Building on the success of CXR-LT 2023, CXR-LT 2024 expands the dataset to 377,110 CXRs and 45 disease labels, including 19 new rare disease findings. It also introduces a new focus on zero-shot learning to address limitations identified in the previous event. Specifically, CXR-LT 2024 features three tasks: (i) long-tailed classification on a large, noisy test set, (ii) long-tailed classification on a manually annotated "gold standard" subset, and (iii) zero-shot generalization to five previously unseen disease findings. This paper provides an overview of CXR-LT 2024, detailing the data curation process and consolidating state-of-the-art solutions, including multimodal models for rare disease detection, advanced generative approaches for handling noisy labels, and zero-shot learning strategies for unseen diseases. The expanded dataset also broadens disease coverage to better represent real-world clinical settings, offering a valuable resource for future research. By synthesizing the insights and innovations of participating teams, we aim to advance the development of clinically realistic and generalizable diagnostic models for chest radiography.