Pattern Recognition Lab, Computer Science, Friedrich-Alexander-Universität Erlangen-Nürnberg, Erlangen, Germany.
Research and Development, EUROIMMUN Medizinische Labordiagnostika AG, Lübeck, Germany.
Sci Data. 2022 Jun 3;9(1):269. doi: 10.1038/s41597-022-01389-0.
Pulmonary hemorrhage (P-Hem) occurs in multiple species and can have various causes. Cytology of bronchoalveolar lavage fluid (BALF), in which alveolar macrophages are scored on a 5-tier system based on their hemosiderin content, is considered the most sensitive diagnostic method. We introduce a novel, fully annotated multi-species P-Hem dataset consisting of 74 cytology whole slide images (WSIs) with equine, feline and human samples. To create this large, high-quality dataset, we developed an annotation pipeline that combines human expertise with deep learning and data visualisation techniques. We applied a deep learning-based object detection approach, trained on 17 expertly annotated equine WSIs, to the remaining 39 equine, 12 human and 7 feline WSIs. The resulting annotations were semi-automatically screened for errors on multiple types of specialised annotation maps and finally reviewed by a trained pathologist. Our dataset contains a total of 297,383 hemosiderophages classified into five grades. It is one of the largest publicly available WSI datasets with respect to the number of annotations, the scanned area and the number of species covered.
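As an illustration of how per-cell grades from a 5-tier system might be condensed into one slide-level value, the following minimal sketch computes a weighted score. The abstract does not state a formula, so this assumes the grades are the integers 0 to 4 and that the score is the mean grade multiplied by 100 (range 0 to 400). The function name `hemosiderin_score` is hypothetical and is not part of the dataset or its tooling.

```python
from collections import Counter


def hemosiderin_score(grades):
    """Return a slide-level score in [0, 400]: the mean cell grade times 100.

    Assumption (not from the abstract): grades are integers 0..4, one per
    annotated hemosiderophage on the slide.
    """
    counts = Counter(grades)
    if any(g not in range(5) for g in counts):
        raise ValueError("grades must be integers from 0 to 4")
    total = sum(counts.values())
    if total == 0:
        raise ValueError("at least one graded cell is required")
    # Weight each grade by how many cells received it, then normalise.
    return 100 * sum(g * n for g, n in counts.items()) / total
```

A slide whose cells are spread evenly across all five grades would score 200 under this assumption, and a slide containing only grade-4 cells would score 400.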