A Comparison Study of Person Identification Using IR Array Sensors and LiDAR.

Author information

Liu Kai, Bouazizi Mondher, Xing Zelin, Ohtsuki Tomoaki

Affiliations

Graduate School of Science and Technology, Keio University, Yokohama 223-8522, Japan.

Faculty of Science and Technology, Keio University, Yokohama 223-8522, Japan.

Publication information

Sensors (Basel). 2025 Jan 6;25(1):271. doi: 10.3390/s25010271.

Abstract

Person identification is a critical task in applications such as security and surveillance, requiring reliable systems that perform robustly under diverse conditions. This study evaluates the Vision Transformer (ViT) and ResNet34 models across three modalities (RGB, thermal, and depth) using datasets collected with infrared array sensors and LiDAR sensors in controlled scenarios and at varying resolutions (16 × 12 to 640 × 480) to explore their effectiveness in person identification. Preprocessing techniques, including YOLO-based cropping, were employed to improve subject isolation. Results show similar identification performance across the three modalities, particularly at high resolution (i.e., 640 × 480), with RGB image classification reaching 100.0%, depth images reaching 99.54%, and thermal images reaching 97.93%. However, upon deeper investigation, thermal images show more robustness and generalizability by maintaining focus on subject-specific features even at low resolutions. In contrast, RGB data performs well at high resolutions but exhibits reliance on background features as resolution decreases. Depth data shows significant degradation at lower resolutions, suffering from scattered attention and artifacts. These findings highlight the importance of modality selection, with thermal imaging emerging as the most reliable. Future work will explore multi-modal integration, advanced preprocessing, and hybrid architectures to enhance model adaptability and address current limitations. This study highlights the potential of thermal imaging and the need for modality-specific strategies in designing robust person identification systems.
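
The abstract does not state how the lower-resolution inputs were obtained. As a rough illustration of what the step from 640 × 480 down to 16 × 12 involves, the sketch below assumes block-mean downsampling by an integer factor (40 in each dimension). The function name `downsample_mean` and the synthetic frame are hypothetical and are not taken from the paper.

```python
def downsample_mean(frame, factor):
    """Shrink a 2D frame (list of equal-length rows) by averaging
    non-overlapping factor x factor blocks. Assumed method, for illustration."""
    height, width = len(frame), len(frame[0])
    if height % factor or width % factor:
        raise ValueError("frame dimensions must be divisible by factor")
    out = []
    for top in range(0, height, factor):
        row = []
        for left in range(0, width, factor):
            # Sum every pixel in the current block, then take the mean.
            block_sum = sum(
                frame[y][x]
                for y in range(top, top + factor)
                for x in range(left, left + factor)
            )
            row.append(block_sum / (factor * factor))
        out.append(row)
    return out


# Synthetic 640 x 480 frame (480 rows of 640 values); not real sensor data.
full = [[(x + y) % 256 for x in range(640)] for y in range(480)]
low = downsample_mean(full, 40)
print(len(low[0]), "x", len(low))  # prints "16 x 12"
```

Each output value here summarizes 1,600 input pixels, which gives a sense of how little spatial detail remains at the lowest resolution the study considers.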

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/22fd/11723478/643871799ef3/sensors-25-00271-g001.jpg
