Choi Ho-Hyoung
School of Dentistry, Advanced Dental Device Development Institute, Kyungpook National University, Jung-gu, Daegu 41940, Republic of Korea.
Sensors (Basel). 2023 Jun 5;23(11):5341. doi: 10.3390/s23115341.
To achieve computer vision color constancy (CVCC), it is vital but challenging to estimate scene illumination from a digital image, since the illumination distorts the true color of an object. Estimating the illuminant as accurately as possible is fundamental to improving the quality of the image processing pipeline. CVCC has a long research history and has advanced significantly, but it has yet to overcome limitations such as algorithm failure or decreased accuracy under unusual circumstances. To address some of these bottlenecks, this article presents a novel CVCC approach that introduces a residual-in-residual dense selective kernel network (RiR-DSN). As its name implies, it nests one residual network within another (RiR), and the RiR houses a dense selective kernel network (DSN). A DSN is composed of selective kernel convolutional blocks (SKCBs). The SKCBs, referred to here as neurons, are interconnected in a feed-forward fashion: every neuron receives input from all of its preceding neurons and passes its feature maps to all of its subsequent neurons, which is how information flows through the proposed architecture. In addition, a dynamic selection mechanism is incorporated into each neuron so that it can modulate its filter kernel sizes according to varying intensities of stimuli. In a nutshell, the proposed RiR-DSN architecture features neurons called SKCBs and a residual block within a residual block, which brings several benefits: alleviation of vanishing gradients, enhanced feature propagation, promotion of feature reuse, modulation of receptive-field sizes in response to varying intensities of stimuli, and a dramatic drop in the number of parameters. Experimental results show that the RiR-DSN architecture performs well above its state-of-the-art counterparts and proves to be camera- and illuminant-invariant.
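The abstract gives no code, so the following is a minimal PyTorch-style sketch, not the authors' implementation, of how a selective kernel convolutional block (SKCB), dense feed-forward connections between blocks, and residual-in-residual skips could fit together for illuminant estimation. All class names, layer widths, the kernel sizes (3 and 5), and the 3-vector RGB illuminant head are illustrative assumptions.

# Sketch only: layer widths, kernel sizes, and the regression head are assumptions.
import torch
import torch.nn as nn


class SKCB(nn.Module):
    """Selective kernel block: two branches with different kernel sizes,
    fused by per-channel attention so the effective receptive field adapts
    to the input (the 'dynamic selection mechanism' in the abstract)."""

    def __init__(self, channels: int, reduction: int = 4):
        super().__init__()
        self.branch3 = nn.Conv2d(channels, channels, 3, padding=1)
        self.branch5 = nn.Conv2d(channels, channels, 5, padding=2)
        hidden = max(channels // reduction, 8)
        self.fc = nn.Sequential(nn.Linear(channels, hidden), nn.ReLU(inplace=True))
        self.select = nn.Linear(hidden, 2 * channels)  # per-branch, per-channel logits
        self.act = nn.ReLU(inplace=True)

    def forward(self, x):
        u3, u5 = self.branch3(x), self.branch5(x)
        s = (u3 + u5).mean(dim=(2, 3))                 # global average pooling
        logits = self.select(self.fc(s)).view(x.size(0), 2, -1)
        a = torch.softmax(logits, dim=1)               # soft selection between kernel sizes
        fused = a[:, 0, :, None, None] * u3 + a[:, 1, :, None, None] * u5
        return self.act(fused)


class DenseSKGroup(nn.Module):
    """Densely connected SKCBs: each block receives the concatenation of all
    preceding feature maps, projected back to a fixed width; a skip over the
    whole group forms the inner residual of the residual-in-residual design."""

    def __init__(self, channels: int, num_blocks: int = 3):
        super().__init__()
        self.blocks = nn.ModuleList(SKCB(channels) for _ in range(num_blocks))
        self.projs = nn.ModuleList(
            nn.Conv2d(channels * (i + 1), channels, 1) for i in range(num_blocks)
        )

    def forward(self, x):
        feats = [x]
        for proj, block in zip(self.projs, self.blocks):
            feats.append(block(proj(torch.cat(feats, dim=1))))
        return x + feats[-1]                           # inner residual


class RiRDSN(nn.Module):
    """Outer residual wrapping several dense SK groups, with a head that
    regresses a 3-vector illuminant estimate (an assumption about the output)."""

    def __init__(self, channels: int = 64, num_groups: int = 2):
        super().__init__()
        self.stem = nn.Conv2d(3, channels, 3, padding=1)
        self.groups = nn.Sequential(*(DenseSKGroup(channels) for _ in range(num_groups)))
        self.head = nn.Linear(channels, 3)

    def forward(self, x):
        f = self.stem(x)
        f = f + self.groups(f)                         # outer residual (residual-in-residual)
        return self.head(f.mean(dim=(2, 3)))           # pooled features -> RGB illuminant


if __name__ == "__main__":
    model = RiRDSN()
    rgb = model(torch.rand(2, 3, 64, 64))              # (2, 3) illuminant estimates
    print(rgb.shape)

In this sketch the softmax weights over the two branches are what let each block trade a 3x3 receptive field against a 5x5 one per channel, which corresponds to the kernel-size modulation the abstract attributes to the dynamic selection mechanism, while the dense concatenation and the nested skips correspond to the feature-reuse and gradient-flow benefits it lists.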