Kanadath Anusree, J Angel Arul Jothi, Urolagin Siddhaling
Department of Computer Science, Birla Institute of Technology and Science Pilani, Dubai Campus, Dubai International Academic City, 345055, Dubai, United Arab Emirates.
Comput Biol Med. 2025 Jul 15;196(Pt B):110680. doi: 10.1016/j.compbiomed.2025.110680.
Precise identification of objects of interest (OoIs) in histopathology images plays a vital role in cancer diagnosis and prognosis. Despite advances in digital pathology, detecting specific cellular structures within these images remains a significant challenge due to the inherent complexity and variability of cell morphology. Cellular structures exhibit similar visual characteristics, such as colors, shapes, and textures, making them difficult to distinguish from one another. Certain OoIs are much smaller than surrounding cells, rendering manual detection both challenging and error-prone. This paper introduces the hybrid vision transformer-based UNet (HVUNet) model, a novel approach designed to effectively identify and localize OoIs in histopathology images. To improve detection in histopathology images, the proposed model integrates vision transformers (ViTs) into a UNet-style encoder-decoder architecture. We evaluate HVUNet on the GZMH dataset, which contains histopathology images annotated for mitosis detection, and on the Lymphocyte Detection (LD) dataset for lymphocyte cell detection. Through comprehensive experiments, we demonstrate that HVUNet notably surpasses several state-of-the-art models, including CNN variants, ViT-based models, and hybrid CNN-ViT architectures. Experimental results show that HVUNet outperforms traditional models such as UNet and recent advancements such as UNETR and AttentionUNet, achieving a precision of 0.94, a recall of 0.60, and an F1-score of 0.72 on the GZMH dataset. Furthermore, HVUNet attained an Intersection over Union (IoU) score of 0.76 and a mean Average Precision (mAP) of 0.81, underscoring its effectiveness in detecting mitotic cells. The model also achieved an F1-score of 0.76, an IoU of 0.63, and a mAP of 0.75 on the lymphocyte detection dataset, demonstrating its effectiveness in detecting lymphocyte cells. To evaluate generalizability, we tested HVUNet on the MIDOG 2021 and PanopTILs datasets, observing competitive performance that demonstrates its robustness and broad applicability across diverse histopathology image analysis tasks.
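To make the abstract's architectural idea concrete, the following is a minimal sketch of a hybrid ViT-UNet encoder-decoder in PyTorch. The paper does not publish its implementation here, so the class name, layer sizes, encoder depth, and the placement of the transformer at the bottleneck are illustrative assumptions, not the authors' actual HVUNet design.

```python
# Hedged sketch of a hybrid ViT-UNet: convolutional encoder/decoder with skip
# connections and a transformer bottleneck over flattened spatial tokens.
# All hyperparameters (base width, depth, heads) are assumptions.
import torch
import torch.nn as nn


def conv_block(in_ch, out_ch):
    """Two 3x3 conv + ReLU layers, the standard UNet building block."""
    return nn.Sequential(
        nn.Conv2d(in_ch, out_ch, 3, padding=1), nn.ReLU(inplace=True),
        nn.Conv2d(out_ch, out_ch, 3, padding=1), nn.ReLU(inplace=True),
    )


class HybridViTUNet(nn.Module):
    """UNet-style encoder-decoder with a ViT bottleneck (illustrative only)."""

    def __init__(self, in_ch=3, base=32, num_classes=1, depth=4, heads=4):
        super().__init__()
        self.enc1 = conv_block(in_ch, base)
        self.enc2 = conv_block(base, base * 2)
        self.pool = nn.MaxPool2d(2)
        # Transformer bottleneck: each spatial position becomes one token,
        # giving global self-attention over the downsampled feature map.
        layer = nn.TransformerEncoderLayer(
            d_model=base * 2, nhead=heads, batch_first=True)
        self.vit = nn.TransformerEncoder(layer, num_layers=depth)
        self.up = nn.ConvTranspose2d(base * 2, base, 2, stride=2)
        self.dec1 = conv_block(base * 2, base)  # skip concat doubles channels
        self.head = nn.Conv2d(base, num_classes, 1)

    def forward(self, x):
        s1 = self.enc1(x)                      # (B, base, H, W)
        z = self.enc2(self.pool(s1))           # (B, 2*base, H/2, W/2)
        b, c, h, w = z.shape
        tokens = z.flatten(2).transpose(1, 2)  # (B, h*w, c) token sequence
        tokens = self.vit(tokens)              # global self-attention
        z = tokens.transpose(1, 2).reshape(b, c, h, w)
        z = self.up(z)                         # upsample back to (B, base, H, W)
        z = self.dec1(torch.cat([z, s1], 1))   # fuse with encoder skip features
        return torch.sigmoid(self.head(z))     # per-pixel OoI probability map


if __name__ == "__main__":
    model = HybridViTUNet()
    out = model(torch.randn(1, 3, 64, 64))
    print(out.shape)  # torch.Size([1, 1, 64, 64])
```

The design choice this sketch illustrates is the one the abstract describes: convolutions preserve fine local detail (important for small OoIs such as mitotic figures), while the transformer bottleneck adds the global context needed to separate visually similar cellular structures.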