College of Electronic Information and Optical Engineering, Nankai University, Tianjin, 300350, China.
Neural Netw. 2024 Nov;179:106541. doi: 10.1016/j.neunet.2024.106541. Epub 2024 Jul 14.
Compressed sensing (CS) is a groundbreaking paradigm in image acquisition that challenges the constraints of the Nyquist-Shannon sampling theorem, enabling high-quality image reconstruction from a small number of measurements. The strong feature induction capabilities of neural networks allow advanced data-driven CS methods to achieve high-fidelity image reconstruction. However, satisfactory reconstruction performance, particularly in terms of perceptual quality, remains difficult to achieve at extremely low sampling rates. To address this, we introduce LD-CSNet, a novel two-stage image CS framework based on latent diffusion. In the first stage, an autoencoder pre-trained on a large dataset represents natural images as low-dimensional latent vectors, which establishes prior knowledge distinct from sparsity and effectively reduces the dimensionality of the solution space. In the second stage, a conditional diffusion model performs maximum likelihood estimation in the latent space. A measurement embedding module encodes the measurements into a form suitable for the denoising network, so that they guide the generation process toward reconstructing the low-dimensional latent vector. Finally, the image is reconstructed by the pre-trained decoder. Experimental results on multiple public datasets demonstrate that LD-CSNet achieves superior perceptual quality and robustness to noise, and that it maintains fidelity and visual quality at lower sampling rates. These findings suggest that diffusion models are promising for image CS. Future research can focus on developing more appropriate models for the first stage.
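The data flow described in the abstract can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the Gaussian sampling matrix, the linear noise schedule, the DDPM-style ancestral sampling update, and the names `measure`, `sample_latent`, `embed`, and `eps_model` are all assumptions made for illustration. The abstract does not specify the sampling operator, the sampler, or any network architecture.

```python
import numpy as np

def measure(x, rate, rng):
    """Compress a flattened image block x (n pixels) to m = round(rate * n) measurements.

    Assumption: a random Gaussian sampling matrix; the abstract does not name one.
    """
    n = x.size
    m = max(1, int(round(rate * n)))
    phi = rng.standard_normal((m, n)) / np.sqrt(m)
    return phi @ x.ravel(), phi

def sample_latent(y, embed, eps_model, latent_shape, steps, rng):
    """Generate a latent vector conditioned on the measurements y.

    embed and eps_model are hypothetical callables standing in for the
    measurement embedding module and the denoising network. The update is the
    standard DDPM ancestral sampling step with a linear beta schedule (assumed).
    """
    betas = np.linspace(1e-4, 0.02, steps)
    alphas = 1.0 - betas
    alpha_bar = np.cumprod(alphas)

    cond = embed(y)                          # encode measurements for the denoiser
    z = rng.standard_normal(latent_shape)    # start from pure noise in latent space
    for t in reversed(range(steps)):
        eps = eps_model(z, t, cond)          # predicted noise, guided by cond
        mean = (z - betas[t] / np.sqrt(1.0 - alpha_bar[t]) * eps) / np.sqrt(alphas[t])
        noise = rng.standard_normal(latent_shape) if t > 0 else 0.0
        z = mean + np.sqrt(betas[t]) * noise
    return z

# Toy usage with stand-in components (no trained networks involved).
rng = np.random.default_rng(0)
block = rng.random((32, 32))                 # placeholder image block
y, phi = measure(block, rate=0.01, rng=rng)  # 1% sampling rate
z = sample_latent(
    y,
    embed=lambda meas: meas,                       # identity stand-in
    eps_model=lambda z, t, c: np.zeros_like(z),    # stand-in that predicts zero noise
    latent_shape=(4, 4),
    steps=5,
    rng=rng,
)
# In the pipeline described above, a pre-trained decoder would then map z to an image.
```

With trained components, `eps_model` would be the conditional denoising network and the final latent would be passed to the pre-trained decoder from the first stage.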