IEEE Trans Image Process. 2017 Sep;26(9):4229-4242. doi: 10.1109/TIP.2017.2705426. Epub 2017 May 18.
Pedestrian detection in thermal infrared images poses unique challenges because the images are low in resolution and noisy. Here, we propose a mid-level attribute in the form of a multidimensional template, or tensor, that uses local steering kernels (LSK) as low-level descriptors for detecting pedestrians in far-infrared images. LSK is specifically designed to deal with intrinsic image noise and pixel-level uncertainty by capturing local image geometry succinctly instead of collecting local orientation statistics (e.g., the histograms in histogram of oriented gradients). To learn the LSK tensor, we introduce a new image similarity kernel that follows the popular maximum-margin framework of support vector machines, which allows a relatively short and simple training phase for building a rigid pedestrian detector. The tensor representation has several advantages; in particular, LSK templates allow exact acceleration of the slow but de facto standard sliding-window detection methodology using the multichannel discrete Fourier transform, enabling very fast and efficient pedestrian localization. Experimental studies on publicly available thermal infrared images support our proposals and model assumptions. In addition, we release our in-house annotations of pedestrians in more than 17 000 frames of the OSU color thermal database to share with the research community.
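The claim that a sliding-window template score can be accelerated exactly with a multichannel discrete Fourier transform can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes a plain linear score (the channel-wise dot product between a template and each window) rather than the LSK similarity kernel described in the abstract, and the function names `score_map_direct` and `score_map_fft` and the array shapes are hypothetical. By the correlation theorem, multiplying the feature spectrum by the conjugate template spectrum and summing over channels gives the same score map as the explicit loop, up to floating-point rounding.

```python
import numpy as np


def score_map_direct(features, template):
    """Score every valid window by an explicit sliding-window loop.

    features: array of shape (C, H, W); template: array of shape (C, h, w).
    Returns an array of shape (H - h + 1, W - w + 1).
    """
    _, H, W = features.shape
    _, h, w = template.shape
    out = np.empty((H - h + 1, W - w + 1))
    for y in range(H - h + 1):
        for x in range(W - w + 1):
            # Linear score: sum of element-wise products over all channels.
            out[y, x] = np.sum(features[:, y:y + h, x:x + w] * template)
    return out


def score_map_fft(features, template):
    """Compute the same score map with one 2-D FFT per channel."""
    _, H, W = features.shape
    _, h, w = template.shape
    # Zero-pad the template to the feature-map size before transforming.
    F = np.fft.rfft2(features, s=(H, W), axes=(-2, -1))
    T = np.fft.rfft2(template, s=(H, W), axes=(-2, -1))
    # Correlation theorem: conjugate the template spectrum, then sum channels
    # in the frequency domain so only one inverse transform is needed.
    spectrum = np.sum(F * np.conj(T), axis=0)
    corr = np.fft.irfft2(spectrum, s=(H, W), axes=(-2, -1))
    # Keep only positions where the window fits without circular wrap-around.
    return corr[:H - h + 1, :W - w + 1]


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    feats = rng.standard_normal((3, 12, 10))  # stand-in for per-pixel descriptors
    tmpl = rng.standard_normal((3, 5, 4))     # stand-in for a learned template
    print(np.allclose(score_map_direct(feats, tmpl), score_map_fft(feats, tmpl)))
```

Summing over channels before the inverse transform is what keeps the cost at a single inverse FFT regardless of the number of descriptor channels; the cropped region matches the direct loop because no window in it wraps around the image border.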