Beijing Advanced Innovation Center for Intelligent Robot and System, Beijing Institute of Technology, Beijing 100081, China.
School of Optical and Electronic Information, Huazhong University of Science and Technology, Wuhan 430074, China.
Sensors (Basel). 2019 Dec 18;20(1):2. doi: 10.3390/s20010002.
A real-time rodent tracking framework is proposed to automatically detect and track multiple objects in real time and output the coordinates of each object. The framework combines deep learning (YOLO v3: You Only Look Once, version 3), a Kalman filter, an improved Hungarian algorithm, and a nine-point position correction algorithm. A Rat-YOLO model is trained in our experiment. The Kalman filter is built on an acceleration model to predict the position of each rat in the next frame. The predicted data are used to fill in a rat's missing position when Rat-YOLO fails on the current frame, and to associate IDs between the last frame and the current frame. The Hungarian assignment algorithm establishes the correspondence between objects in the last frame and objects in the current frame and matches the IDs of the objects. The nine-point position correction algorithm is presented to check the correctness of the Rat-YOLO results and the predicted results. Because training a deep network requires more data than our experiment provides, and manual annotation is time-consuming, software that automatically generates labeled datasets under a fixed scene is proposed, and the labeled datasets are manually verified for correctness. In addition, in an off-line experiment, a mask is applied to remove highlights. In this experiment, we select 500 frames of the data as the training dataset and label these images with the automatic label-generating software. A video of 2892 frames is tested with the trained Rat-YOLO model; the accuracy of detecting all three rats is about 72.545%, whereas combining Rat-YOLO with the Kalman filter and the nine-point position correction algorithm improves the accuracy to 95.194%.
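The acceleration-model prediction described above can be sketched as a standard Kalman filter with a 2-D constant-acceleration state. This is a minimal illustration, not the paper's implementation: the state layout `[x, y, vx, vy, ax, ay]`, the time step, and the noise covariances are all illustrative assumptions.

```python
import numpy as np

def make_transition(dt):
    """State-transition matrix F for a 2-D constant-acceleration model.

    State vector: [x, y, vx, vy, ax, ay].
    """
    F = np.eye(6)
    F[0, 2] = F[1, 3] = dt            # position += velocity * dt
    F[2, 4] = F[3, 5] = dt            # velocity += acceleration * dt
    F[0, 4] = F[1, 5] = 0.5 * dt**2   # position += 0.5 * acceleration * dt^2
    return F

def predict(x, P, F, Q):
    """Kalman predict step: propagate the state and its covariance."""
    x_pred = F @ x
    P_pred = F @ P @ F.T + Q
    return x_pred, P_pred

def update(x_pred, P_pred, z, H, R):
    """Kalman update step with a position-only measurement z = [x, y]."""
    y = z - H @ x_pred                   # innovation
    S = H @ P_pred @ H.T + R             # innovation covariance
    K = P_pred @ H.T @ np.linalg.inv(S)  # Kalman gain
    x_new = x_pred + K @ y
    P_new = (np.eye(len(x_pred)) - K @ H) @ P_pred
    return x_new, P_new
```

When Rat-YOLO produces no detection for a rat in the current frame, the tracker can fall back on `x_pred[:2]` as that rat's position, which is the role the predicted data play in the framework.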
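The ID association between frames can be sketched with the Hungarian algorithm (here via SciPy's `linear_sum_assignment`): Kalman-predicted positions from the last frame are matched to current-frame detections by minimising total distance. The gating threshold `max_dist` is an illustrative assumption, not a parameter from the paper.

```python
import numpy as np
from scipy.optimize import linear_sum_assignment

def associate_ids(predicted, detected, max_dist=50.0):
    """Match predicted positions (N x 2) to detections (M x 2).

    Returns a list of (pred_index, det_index) pairs from the
    minimum-cost assignment; pairs farther apart than max_dist
    (a hypothetical gate) are rejected as failed matches.
    """
    # Pairwise Euclidean distances form the cost matrix.
    cost = np.linalg.norm(predicted[:, None, :] - detected[None, :, :], axis=2)
    rows, cols = linear_sum_assignment(cost)  # Hungarian assignment
    return [(r, c) for r, c in zip(rows, cols) if cost[r, c] <= max_dist]
```

Detections left unmatched after gating would start new tracks, and unmatched predictions would be filled from the Kalman output, mirroring the fallback behaviour the abstract describes.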