Zhang Xiaoyu, Zhang Laixian, Guo Huichao, Zheng Haijing, Sun Houpeng, Li Yingchun, Li Rong, Luan Chenglong, Tong Xiaoyun
Graduate School, Space Engineering University, Beijing 101416, China.
Key Laboratory of Intelligent Space TTC&O, Ministry of Education, Space Engineering University, Beijing 101416, China.
Sensors (Basel). 2025 Jan 24;25(3):697. doi: 10.3390/s25030697.
Laser active imaging systems can compensate for the shortcomings of visible-light imaging systems under difficult imaging conditions, thereby acquiring clear images. However, laser images exhibit a significant modal discrepancy relative to visible images, which impedes both human perception and computer processing. Consequently, cross-modal translation of laser images into visible images is necessary. Existing cross-modal image translation algorithms suffer from issues including training difficulties and color bleeding. In recent studies, diffusion models have demonstrated superior image generation and translation abilities, producing high-quality images. To achieve more accurate laser-visible image translation, we designed an improved diffusion model, called DCLTV, which limits the randomness of the diffusion process by means of dual-condition control. We incorporated the Brownian bridge strategy as the first condition control and employed interpolation-based conditional injection as the second. We also constructed a dataset of 665 laser-visible image pairs to compensate for the data deficiency in the field of laser-visible image translation. Compared with five representative baseline models, namely Pix2pix, BigColor, CT2, ColorFormer, and DDColor, the proposed DCLTV achieved the best performance in both qualitative and quantitative comparisons, realizing at least a 15.89% reduction in FID and at least a 22.02% reduction in LPIPS. We further validated the effectiveness of the dual conditions in DCLTV through ablation experiments, achieving the best results with an FID of 154.74 and an LPIPS of 0.379.
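To make the dual-condition idea concrete, the following is a minimal PyTorch sketch of what a Brownian-bridge forward process and an interpolation-based condition injection could look like. It follows the general Brownian bridge diffusion formulation used in BBDM-style models, not DCLTV's exact design: the variance schedule delta_t, the scale s_max, the blend weight alpha, and the function names are illustrative assumptions rather than details from the paper.

import torch

def brownian_bridge_forward(x0, y, t, T, s_max=1.0):
    # Forward step of a Brownian-bridge diffusion between paired images:
    # x0 is the target (visible) image batch, y the source (laser) image batch,
    # t a 1-D tensor of integer timesteps in [0, T]. s_max is a hypothetical
    # variance scale; DCLTV's actual schedule is not specified here.
    m_t = (t.float() / T).view(-1, 1, 1, 1)       # interpolation weight in [0, 1]
    delta_t = 2.0 * s_max * (m_t - m_t ** 2)      # bridge variance, zero at both endpoints
    eps = torch.randn_like(x0)
    # The mean drifts linearly from x0 (at t = 0) to y (at t = T); the noise term
    # vanishes at the endpoints, so the process is pinned to both modalities.
    x_t = (1.0 - m_t) * x0 + m_t * y + torch.sqrt(delta_t) * eps
    return x_t, eps

def inject_condition(x_t, y, alpha=0.9):
    # One plausible reading of "interpolation-based conditional injection":
    # linearly blend the laser condition y into the noisy state before it is
    # fed to the denoising network. The blend weight alpha is hypothetical.
    return alpha * x_t + (1.0 - alpha) * y

The property that makes such a bridge attractive for paired translation is that it is pinned at both ends, with the injected noise vanishing at t = 0 and t = T, which is one way to limit the randomness of a standard diffusion process as the abstract describes.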