Shi Xin, Huang Qing, Xu Teng, Mei Hongwen, Quan Tingwei, Wang Xiuli, Shi Yinghan, Hu Ye, Duan Zhimei, Xie Fei, Li Sifan, Xie Lixin, Wang Kaifei
College of Pulmonary and Critical Care Medicine, Chinese PLA General Hospital, Beijing, China.
Chinese PLA Medical School, Beijing, China.
Sci Data. 2025 Jul 1;12(1):1074. doi: 10.1038/s41597-025-05452-4.
Bronchoalveolar lavage fluid (BALF) cytology provides an important basis for the diagnosis and treatment of lung diseases. Current cytological analysis of BALF relies on manual microscopic examination, which is time-consuming, laborious, and experience-dependent. Automated identification of BALF cells can increase the accuracy and speed of screening for qualified samples and of the subsequent cytomorphology analysis. However, there is a lack of public clinical BALF cell datasets for the detection of different cell types, and a lack of pixel-level annotations for cytomorphology analysis. In this work, high-resolution cell images were collected from clinical bronchoalveolar lavage samples obtained at the Chinese PLA General Hospital from 2018 to 2024, and high-quality pixel-level instance annotations were produced for seven cell types. In total, 2,105 clinical images were gathered, containing 13,263 cells from seven distinct classes, each annotated with both a fine contour label and a bounding box label. A YOLOv8 instance segmentation network was trained and tested on the dataset. The results demonstrate that the dataset and model we provide are beneficial for the study of automated cell identification in BALF.
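The two annotation types mentioned above (fine contours and bounding boxes) are closely related, and a minimal sketch may help show how one cell instance could be prepared for segmentation training. This is an illustration under stated assumptions, not the authors' pipeline: it assumes each contour is stored as a list of pixel (x, y) vertices, and it assumes a label line made of a class index followed by polygon vertices normalized to [0, 1], the text layout commonly used for YOLOv8 segmentation training. The function names, the class index, and the image size are hypothetical; the dataset's actual file format is not stated here.

```python
from typing import List, Tuple

Point = Tuple[float, float]


def contour_to_bbox(contour: List[Point]) -> Tuple[float, float, float, float]:
    """Return the tight axis-aligned box (x_min, y_min, x_max, y_max) around a contour."""
    xs = [x for x, _ in contour]
    ys = [y for _, y in contour]
    return min(xs), min(ys), max(xs), max(ys)


def contour_to_yolo_seg_line(
    class_id: int, contour: List[Point], img_w: int, img_h: int
) -> str:
    """Format one cell instance as a label line: class index, then polygon
    vertices normalized by image width and height (assumed layout)."""
    coords = []
    for x, y in contour:
        coords.append(f"{x / img_w:.6f}")
        coords.append(f"{y / img_h:.6f}")
    return " ".join([str(class_id)] + coords)


# Hypothetical cell outline in pixel coordinates (not taken from the dataset).
contour = [(100.0, 50.0), (140.0, 60.0), (150.0, 100.0), (110.0, 110.0)]

bbox = contour_to_bbox(contour)
line = contour_to_yolo_seg_line(2, contour, img_w=1000, img_h=500)

print(bbox)  # (100.0, 50.0, 150.0, 110.0)
print(line)  # 2 0.100000 0.100000 0.140000 0.120000 0.150000 0.200000 0.110000 0.220000
```

Deriving the box from the contour keeps the two labels consistent by construction, which is one reason a contour-first workflow can be convenient when both annotation types are needed.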