IEEE Trans Pattern Anal Mach Intell. 2021 Feb;43(2):567-578. doi: 10.1109/TPAMI.2019.2936841. Epub 2021 Jan 8.
We propose Neural Image Compression (NIC), a two-step method for building convolutional neural networks for gigapixel image analysis using only weak image-level labels. First, gigapixel images are compressed by a neural network trained in an unsupervised fashion, retaining high-level information while suppressing pixel-level noise. Second, a convolutional neural network (CNN) is trained on these compressed image representations to predict image-level labels, avoiding the need for fine-grained manual annotations. We compared several encoding strategies, namely reconstruction error minimization, contrastive training, and adversarial feature learning, and evaluated NIC on a synthetic task and two public histopathology datasets. We found that NIC successfully exploits visual cues associated with image-level labels, integrating both global and local visual information. Furthermore, we visualized the regions of the input gigapixel images to which the CNN attended, and confirmed that they overlapped with annotations from human experts.
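The two-step pipeline described in the abstract can be sketched in a minimal, non-neural form: step one maps a large image to a small grid of patch embeddings, and step two applies a classifier to that grid using only an image-level label. In the paper the encoder is an unsupervised neural network and the classifier is a CNN; here a fixed patch-statistics embedding and a pooled linear model stand in for both, purely to illustrate the data flow. All function names and parameters below are illustrative, not from the paper.

```python
import numpy as np

def encode_patches(image, patch=32, dim=4):
    """Step 1 (sketch): compress a gigapixel image into a grid of
    patch embeddings. NIC trains an unsupervised neural encoder
    (reconstruction, contrastive, or adversarial); a fixed
    summary-statistics embedding stands in here."""
    H, W = image.shape
    gh, gw = H // patch, W // patch
    grid = np.zeros((gh, gw, dim))
    for i in range(gh):
        for j in range(gw):
            p = image[i * patch:(i + 1) * patch,
                      j * patch:(j + 1) * patch]
            # toy embedding: per-patch summary statistics
            grid[i, j] = [p.mean(), p.std(), p.max(), p.min()]
    return grid

def predict_label(grid, w, b):
    """Step 2 (sketch): predict an image-level label from the
    compressed grid. NIC trains a CNN over the grid; a
    global-average-pooled linear model stands in here."""
    pooled = grid.mean(axis=(0, 1))       # aggregate global context
    return float(pooled @ w + b > 0.0)    # binary image-level label

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    img = rng.random((256, 256))          # stand-in for a gigapixel slide
    grid = encode_patches(img)            # (8, 8, 4) compressed grid
    print(grid.shape)
    print(predict_label(grid, np.ones(4), -1.0))
```

Because the compressed grid is orders of magnitude smaller than the original image, the step-two classifier can see the whole slide at once, which is what lets NIC combine global and local cues without pixel-level annotations.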