A New Benchmark for Consumer Visual Tracking and Apparent Demographic Estimation from RGB and Thermal Images.

Affiliations

Department of Computer Science and Engineering (CSE), University of Ioannina, 45110 Ioannina, Greece.

Institute for Language and Speech Processing (ILSP), Athena Research and Innovation Center, 15125 Athens, Greece.

Publication information

Sensors (Basel). 2023 Nov 29;23(23):9510. doi: 10.3390/s23239510.

DOI: 10.3390/s23239510

PMID: 38067883

Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC10708599/

Abstract

Visual tracking of multiple people in a scene, and the estimation of attributes such as their age or gender, have become mature research topics with the advent of deep learning techniques. However, for indoor imagery such as video sequences of retail consumers, the available data are not always adequate or accurate enough to train effective models for consumer detection and tracking under various adverse factors. This in turn degrades the quality of age and gender recognition for the detected instances. In this work, we introduce two novel datasets: the first comprises 145 video sequences that comply with personal information regulations as far as facial images are concerned, and the second is a set of cropped body images from each sequence that can be used for numerous computer vision tasks. We also propose an end-to-end framework that comprises CNNs as object detectors, LSTMs for motion forecasting in the tracklet association component, and a multi-attribute classification model for apparent demographic estimation of the detected outputs, aiming to capture useful metadata on consumer product preferences. The results obtained on tracking and age/gender prediction are promising with respect to reference systems and indicate the proposed model's potential for practical consumer metadata extraction.
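To make the tracklet association step more concrete, the following is a minimal sketch of how forecast boxes can be matched to new detections. It is not the authors' implementation: the abstract states that LSTMs forecast motion, whereas this sketch substitutes a simple constant-velocity forecast, and the names `Tracklet`, `forecast`, `associate`, and the `min_iou` threshold are all hypothetical choices made for illustration.

```python
from dataclasses import dataclass, field
from typing import List, Tuple

Box = Tuple[float, float, float, float]  # (x1, y1, x2, y2) in pixels


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two axis-aligned boxes."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0


@dataclass
class Tracklet:
    track_id: int
    boxes: List[Box] = field(default_factory=list)

    def forecast(self) -> Box:
        # Assumption: constant-velocity extrapolation stands in for the
        # LSTM motion forecaster mentioned in the abstract.
        if len(self.boxes) < 2:
            return self.boxes[-1]
        prev, last = self.boxes[-2], self.boxes[-1]
        return tuple(l + (l - p) for p, l in zip(prev, last))


def associate(tracklets: List[Tracklet], detections: List[Box],
              min_iou: float = 0.3) -> List[Tracklet]:
    """Greedily match forecast boxes to detections, best IoU first.

    Detections left unmatched start new tracklets.
    """
    pairs = sorted(
        ((iou(t.forecast(), d), ti, di)
         for ti, t in enumerate(tracklets)
         for di, d in enumerate(detections)),
        reverse=True,
    )
    used_t, used_d = set(), set()
    for score, ti, di in pairs:
        if score < min_iou:
            break  # remaining pairs overlap too little to be the same person
        if ti in used_t or di in used_d:
            continue
        tracklets[ti].boxes.append(detections[di])
        used_t.add(ti)
        used_d.add(di)
    next_id = max((t.track_id for t in tracklets), default=-1) + 1
    for di, d in enumerate(detections):
        if di not in used_d:
            tracklets.append(Tracklet(next_id, [d]))
            next_id += 1
    return tracklets


tracks = [Tracklet(0, [(0, 0, 10, 10), (2, 0, 12, 10)])]
tracks = associate(tracks, [(4, 0, 14, 10), (100, 100, 110, 110)])
```

In the usage lines, the existing tracklet is extended with the detection that overlaps its forecast position, and the distant detection starts a new tracklet. A learned forecaster would replace only the `forecast` method; the matching logic would stay the same.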

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/ece2/10708599/e0a5af7ee7ca/sensors-23-09510-g001.jpg
