Long Xianlei, Hu Shenhua, Hu Yiming, Gu Qingyi, Ishii Idaku
The Research Center of Precision Sensing and Control, Institute of Automation, Chinese Academy of Sciences, Beijing 100190, China.
The School of Artificial Intelligence, University of Chinese Academy of Sciences, Beijing 101408, China.
Sensors (Basel). 2019 Aug 26;19(17):3707. doi: 10.3390/s19173707.
An ultra-high-speed object detection algorithm based on the Histogram of Oriented Gradients (HOG) and a Support Vector Machine (SVM) is proposed for hardware implementation at 10,000 frames per second (FPS) under complex backgrounds. The algorithm is implemented on the field-programmable gate array (FPGA) of a high-speed vision platform, in which 64 pixels are input per clock cycle. The high pixel parallelism of the vision platform limits its performance, because it is difficult to reduce the stride between detection windows below 16 pixels, which introduces a non-negligible deviation in object detection. In addition, because of limited transmission bandwidth, only one frame in every four can be transmitted to the PC for post-processing; that is, 75% of the image information is wasted. To overcome these problems, a multi-frame information fusion model is proposed in this paper. Image data and synchronization signals are first regenerated according to image frame numbers. The maximum HOG feature value and the corresponding coordinates of each frame are stored at the bottom of the image, together with those of adjacent frames. The compensated coordinates are then obtained through information fusion using the confidence of consecutive frames. Several experiments are conducted to demonstrate the performance of the proposed algorithm. The evaluation results show that the proposed method reduces the deviation compared with the existing one.
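The abstract does not specify the fusion rule, so the following is only a minimal sketch of one plausible reading: each frame contributes the coordinates of its maximum response, and the per-frame maximum value is used as a confidence weight in a weighted average over a group of consecutive frames. The names `FrameResult` and `fuse_coordinates`, and the choice of a weighted average, are assumptions made for illustration and are not taken from the paper.

```python
from dataclasses import dataclass
from typing import Sequence, Tuple


@dataclass
class FrameResult:
    """Per-frame detection summary (hypothetical structure)."""
    x: float      # column of the window with the maximum response
    y: float      # row of the window with the maximum response
    score: float  # maximum response, treated here as a confidence (assumption)


def fuse_coordinates(results: Sequence[FrameResult]) -> Tuple[float, float]:
    """Confidence-weighted average of coordinates over consecutive frames.

    This is an illustrative stand-in for the paper's fusion model, not a
    reproduction of it.
    """
    if not results:
        raise ValueError("at least one frame result is required")

    # Negative scores are clipped so they cannot act as negative weights.
    weights = [max(r.score, 0.0) for r in results]
    total = sum(weights)

    if total == 0.0:
        # No frame is confident: fall back to an unweighted mean.
        n = len(results)
        return (sum(r.x for r in results) / n,
                sum(r.y for r in results) / n)

    x = sum(w * r.x for w, r in zip(weights, results)) / total
    y = sum(w * r.y for w, r in zip(weights, results)) / total
    return (x, y)


# Four consecutive frames whose detections sit on a 16-pixel grid.
group = [
    FrameResult(x=0.0, y=32.0, score=1.0),
    FrameResult(x=16.0, y=32.0, score=1.0),
    FrameResult(x=16.0, y=32.0, score=1.0),
    FrameResult(x=16.0, y=32.0, score=1.0),
]
fused = fuse_coordinates(group)
```

With a 16-pixel stride, each per-frame coordinate can only take values on a coarse grid. Combining the frames in a group gives an estimate that can fall between grid positions, which is the intuition behind compensating the deviation with information from adjacent frames.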