Zeng Shujun, Wang Xueping, Liu Min, Liu Qing, Wang Yaonan
IEEE Trans Neural Netw Learn Syst. 2024 Jan;35(1):1013-1024. doi: 10.1109/TNNLS.2022.3179133. Epub 2024 Jan 4.
Video-based person re-identification (re-id) has attracted significant attention in recent years due to the increasing demand for video surveillance. However, existing methods are usually based on supervised learning, which requires a vast number of labeled identities across cameras and is not suitable for real-world scenarios. Although some unsupervised approaches have been proposed for video re-id, their performance is far from satisfactory. In this article, we propose an unsupervised anchor association learning (UAAL) framework to address the video-based person re-id task, in which the feature representation of each sampled tracklet is regarded as an anchor. Specifically, we first propose an intracamera anchor association learning (IAAL) term that learns discriminative anchors by utilizing the affiliation relations between an image and the anchors within each camera. Then, an exponential moving average (EMA) strategy is employed to update the anchors, and the updated anchors are stored in an anchor memory module. On top of that, a cross-camera anchor association learning (CAAL) term is introduced to mine potential positive anchor pairs across cameras by presenting a cyclic ranking anchor alignment and threshold filtering method. Extensive experiments conducted on two public datasets show the superiority of the proposed method; for example, our method achieves 73.2% rank-1 accuracy and a 60.1% mean average precision (mAP) score on MARS, and 89.7% and 87.0%, respectively, on DukeMTMC-VideoReID.
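The EMA-based anchor update described above can be sketched as follows. This is a minimal illustration, not the authors' implementation: the memory layout (a dict keyed by tracklet id), the momentum value, and the L2 normalization step are all assumptions, since the abstract does not give these details.

```python
import numpy as np

def ema_update_anchor(memory, tracklet_id, feature, momentum=0.9):
    """Update one tracklet anchor in the memory module via EMA.

    memory      -- dict mapping tracklet id -> L2-normalized anchor vector
                   (an assumed memory layout, for illustration only)
    feature     -- newly extracted feature for this tracklet
    momentum    -- assumed EMA smoothing factor; the paper's value may differ
    """
    feature = feature / np.linalg.norm(feature)
    if tracklet_id not in memory:
        # first observation: initialize the anchor with the feature itself
        memory[tracklet_id] = feature
    else:
        # exponential moving average, then re-normalize to the unit sphere
        updated = momentum * memory[tracklet_id] + (1.0 - momentum) * feature
        memory[tracklet_id] = updated / np.linalg.norm(updated)
    return memory[tracklet_id]
```

Because the anchor changes slowly (by a factor of 1 − momentum per update), it stays stable across noisy per-batch features, which is the usual motivation for EMA-updated memory banks.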
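The cross-camera pair mining can be illustrated with a mutual (cyclic) nearest-neighbor check plus a similarity threshold. This is a simplified two-camera sketch of the general idea, not the paper's exact CAAL procedure; the cosine-similarity measure and the threshold value are assumptions.

```python
import numpy as np

def mine_cross_camera_pairs(anchors_a, anchors_b, threshold=0.5):
    """Mine candidate positive anchor pairs between two cameras.

    anchors_a -- (n, d) array of L2-normalized anchors from camera A
    anchors_b -- (m, d) array of L2-normalized anchors from camera B
    threshold -- assumed cosine-similarity cutoff (illustrative value)
    Returns a list of (i, j) index pairs that are cyclically consistent:
    j is i's nearest neighbor in B AND i is j's nearest neighbor in A.
    """
    sim = anchors_a @ anchors_b.T       # pairwise cosine similarities
    nn_ab = sim.argmax(axis=1)          # ranking from A to B
    nn_ba = sim.argmax(axis=0)          # ranking from B back to A
    pairs = []
    for i, j in enumerate(nn_ab):
        # keep the pair only if the ranking closes the cycle (A -> B -> A)
        # and the similarity clears the filtering threshold
        if nn_ba[j] == i and sim[i, j] >= threshold:
            pairs.append((i, int(j)))
    return pairs
```

The cyclic check discards one-sided matches, and the threshold filters out mutual matches that are nonetheless too dissimilar to be a reliable positive pair, which is the role the abstract assigns to threshold filtering.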