IEEE Trans Image Process. 2015 Nov;24(11):4381-93. doi: 10.1109/TIP.2015.2463223. Epub 2015 Jul 30.
This paper addresses face clustering in videos. To improve clustering performance by exploiting multiple intrinsic cues, namely pairwise constraints and multiple views, we propose a constrained multi-view video face clustering method under a unified graph-based model. First, unlike most existing video face clustering methods, which employ pairwise constraints only in the clustering step, we enforce the constraints throughout the whole framework, in both the sparse subspace representation and the spectral clustering. In the constrained sparse subspace representation, the sparse representation is forced to explore unknown relationships. In the constrained spectral clustering, the constraints guide the learning of more reasonable new representations. Second, our method considers the video face pairwise constraints and multi-view consistency simultaneously. In particular, graph regularization enforces the pairwise constraints, and co-regularization penalizes disagreement among the graphs of different views. Experiments on three real-world video benchmark data sets demonstrate significant improvements of our method over state-of-the-art methods.
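The two ingredients named in the abstract, pairwise constraints imposed on a graph and a penalty on disagreement between views, can be illustrated with a small NumPy sketch. This is not the authors' formulation, which the abstract does not spell out. It assumes that must-link and cannot-link pairs are written directly into each view's affinity matrix, and it measures disagreement between two views' spectral embeddings with a trace-based term of the kind commonly used in co-regularized spectral clustering. All function names are hypothetical.

```python
import numpy as np


def apply_pairwise_constraints(W, must_link, cannot_link):
    """Return a copy of affinity matrix W with constraints written in.

    Assumption: must-link pairs get affinity 1, cannot-link pairs get 0.
    """
    W = W.copy()
    for i, j in must_link:
        W[i, j] = W[j, i] = 1.0
    for i, j in cannot_link:
        W[i, j] = W[j, i] = 0.0
    return W


def normalized_affinity(W):
    """Symmetric normalization D^{-1/2} W D^{-1/2}."""
    d = W.sum(axis=1)
    d_inv_sqrt = 1.0 / np.sqrt(np.maximum(d, 1e-12))
    return W * d_inv_sqrt[:, None] * d_inv_sqrt[None, :]


def spectral_embedding(W, k):
    """Top-k eigenvectors of the normalized affinity (orthonormal columns)."""
    S = normalized_affinity(W)
    _, vecs = np.linalg.eigh(S)  # eigenvalues returned in ascending order
    return vecs[:, -k:]


def view_disagreement(U_a, U_b):
    """Disagreement between two views' embeddings.

    Equals k - tr(U_a U_a^T U_b U_b^T); it is 0 when both embeddings span
    the same subspace and grows as the subspaces diverge.
    """
    k = U_a.shape[1]
    return k - np.linalg.norm(U_a.T @ U_b, "fro") ** 2
```

In a full method, a term like `view_disagreement` would be summed over pairs of views and minimized jointly with each view's spectral objective. Here it is only evaluated, to show what the penalty measures.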