
Video Person Re-Identification with Frame Sampling-Random Erasure and Mutual Information-Temporal Weight Aggregation.

Affiliations

Information and Communication Engineering, Electronics Information Engineering College, Changchun University of Science and Technology, Changchun 130022, China.

High-Speed Railway Comprehensive Technical College, Jilin Railway Technology College, Jilin 132299, China.

Publication information

Sensors (Basel). 2022 Apr 15;22(8):3047. doi: 10.3390/s22083047.

Abstract

Partial occlusion and background clutter in surveillance video reduce the accuracy of video-based person re-identification (re-ID). To address these problems, we propose a person re-ID method based on frame sampling-random erasure and on mutual information-temporal weight aggregation of partial and global features. First, for cases in which the target person is disturbed or partially occluded, the frame sampling-random erasure (FSE) method is used for data augmentation to alleviate the occlusion problem, improve the generalization ability of the model, and match persons more accurately. Second, to further improve video-based person re-ID accuracy and learn more discriminative feature representations, we use a ResNet-50 network to extract global and partial features and fuse them to obtain frame-level features. In the time dimension, a mutual information-temporal weight aggregation (MI-TWA) module sums the partial features with different weights and the global features with equal weights, then concatenates the results to output sequence-level features. The proposed method is evaluated extensively on three public video datasets, MARS, DukeMTMC-VideoReID, and PRID-2011; the mean average precision (mAP) values are 82.4%, 94.1%, and 95.3%, and the Rank-1 values are 86.4%, 94.8%, and 95.2%, respectively.
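The two steps named in the abstract can be illustrated with a minimal sketch in plain Python. It is not the authors' implementation: the abstract does not say how frames are sampled, how large the erased region is, or how the MI-TWA weights are computed. The function names (`sample_frames`, `random_erase`, `aggregate`), the chunk-based sampling, the erased-area range, and the per-part weights passed in as ready-made numbers are all assumptions made for illustration. Adding partial features "according to different weights" is read here as a weighted sum over time.

```python
import random

def sample_frames(num_frames, num_samples, rng):
    """Pick one frame index from each of num_samples equal-length chunks (assumed scheme)."""
    if num_frames < num_samples:
        raise ValueError("track is shorter than the number of samples")
    bounds = [i * num_frames // num_samples for i in range(num_samples + 1)]
    return [rng.randrange(bounds[i], bounds[i + 1]) for i in range(num_samples)]

def random_erase(frame, rng, area_frac=(0.02, 0.2), fill=0):
    """Return a copy of a 2-D frame with one random rectangle set to `fill`.

    The area range and the constant fill value are illustrative assumptions.
    """
    h, w = len(frame), len(frame[0])
    area = rng.uniform(*area_frac) * h * w
    eh = max(1, min(h, round(area ** 0.5)))   # rectangle height, clamped to the frame
    ew = max(1, min(w, round(area / eh)))     # rectangle width, clamped to the frame
    top = rng.randrange(h - eh + 1)
    left = rng.randrange(w - ew + 1)
    out = [row[:] for row in frame]           # copy so the input frame is untouched
    for r in range(top, top + eh):
        out[r][left:left + ew] = [fill] * ew
    return out

def aggregate(global_feats, part_feats, part_weights):
    """Fuse frame-level features into one sequence-level vector.

    global_feats: T x D list, averaged over time with equal weights.
    part_feats:   T x P x Dp list, summed over time with per-frame weights.
    part_weights: T x P list of non-negative scores (hypothetical inputs; the
                  abstract derives them from mutual information but gives no formula).
    """
    T = len(global_feats)
    seq = [sum(f[d] for f in global_feats) / T for d in range(len(global_feats[0]))]
    for p in range(len(part_feats[0])):
        total = sum(part_weights[t][p] for t in range(T))  # normalise weights over time
        for d in range(len(part_feats[0][p])):
            seq.append(sum(part_weights[t][p] / total * part_feats[t][p][d]
                           for t in range(T)))
    return seq
```

With two frames, one part, and weights 3 and 1, `aggregate([[1.0, 3.0], [3.0, 5.0]], [[[1.0, 0.0]], [[0.0, 1.0]]], [[3.0], [1.0]])` returns the averaged global vector followed by the weighted partial vector. In practice the features would come from a network such as the ResNet-50 backbone mentioned above rather than from hand-written lists.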

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/08a5/9032512/7ae1b4679270/sensors-22-03047-g001.jpg
