IEEE Trans Med Imaging. 2024 Apr;43(4):1388-1399. doi: 10.1109/TMI.2023.3337253. Epub 2024 Apr 3.
Fluorescence staining is an important technique in the life sciences for labeling cellular constituents. However, it is time-consuming and makes simultaneous multi-target labeling difficult. Thus, virtual staining, which does not rely on chemical labeling, has been introduced. Recently, deep learning models such as transformers have been applied to virtual staining tasks. However, their performance relies on large-scale pretraining, hindering their adoption in the field. To reduce the reliance on large amounts of computation and data, we construct a Swin-transformer model and propose an efficient supervised pretraining method based on the masked autoencoder (MAE). Specifically, we adopt downsampling and grid sampling to mask 75% of pixels and reduce the number of tokens. The pretraining time of our method is only 1/16 that of the original MAE. We also design a supervised proxy task that predicts stained images in multiple styles instead of reconstructing masked pixels. Additionally, most virtual staining approaches are based on private datasets and evaluated with different metrics, making fair comparison difficult. Therefore, we develop a standard benchmark based on three public datasets and build a baseline for the convenience of future researchers. We conduct extensive experiments on the three benchmark datasets, and the results show that the proposed method achieves the best performance both quantitatively and qualitatively. In addition, ablation studies illustrate the effectiveness of the proposed pretraining method. The benchmark and code are available at https://github.com/birkhoffkiki/CAS-Transformer.
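The token-reduction idea above can be illustrated with a minimal sketch: keep one pixel per 2×2 cell via strided grid sampling, so 75% of pixels are masked and the visible grid is 4× smaller, shrinking the token count accordingly. The function name and exact sampling scheme here are assumptions for illustration; the authors' actual implementation is in the linked repository.

```python
import numpy as np

def grid_mask(image, stride=2):
    """Sketch of MAE-style grid sampling (assumed scheme, not the
    authors' exact code): keep one pixel per stride x stride cell,
    masking the remaining pixels (75% for stride=2).

    Returns the downsampled visible pixels and a boolean mask
    (True = masked) over the full-resolution image.
    """
    h, w = image.shape[:2]
    visible = image[::stride, ::stride]      # 25% of pixels survive
    mask = np.ones((h, w), dtype=bool)       # start fully masked
    mask[::stride, ::stride] = False         # unmask sampled grid
    return visible, mask
```

For a 224×224 input, the visible grid is 112×112, so a transformer operating on the visible pixels sees a quarter of the tokens, which (together with attention's superlinear cost in token count) is consistent with a large pretraining speedup.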