School of Academy for Advanced Interdisciplinary Studies, Peking University, Beijing, China.
National Engineering Laboratory for Video Technology, School of EECS, Peking University, Beijing, China.
Sci Rep. 2020 Apr 28;10(1):7146. doi: 10.1038/s41598-020-63649-6.
Most existing recognition algorithms are proposed for closed set scenarios, where all categories are known beforehand. In practice, however, recognition is essentially an open set problem: there are categories we know, called "knowns", and many more we do not know, called "unknowns". Enumerating all categories beforehand is never possible; consequently, it is infeasible to prepare sufficient training samples for the unknowns. Applying closed set recognition methods therefore naturally leads to errors on unseen categories. To address this problem, we propose the prototype-based Open Deep Network (P-ODN) for open set recognition tasks. Specifically, we introduce prototype learning into open set recognition. Prototypes and prototype radii are trained jointly to guide a CNN to derive more discriminative features. P-ODN then detects unknowns by applying a multi-class triplet thresholding method based on the distance metric between features and prototypes. The unknowns detected in this step are manually labeled as new categories. Predictors for the new categories are added to the classification layer to "open" the deep neural network so that it can incorporate new categories dynamically. The weights of the new predictors are carefully initialized by a distance-based algorithm that transfers the learned knowledge. This initialization speeds up fine-tuning and reduces the number of samples needed to train the new predictors. Extensive experiments show that P-ODN can effectively detect unknowns and needs only a few samples, labeled with human intervention, to recognize a new category. In real-world scenarios, our method achieves state-of-the-art performance on the UCF11, UCF50, UCF101 and HMDB51 datasets.
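The two distance-based steps described in the abstract can be illustrated with a rough sketch. The abstract gives neither the exact multi-class triplet thresholding rule nor the initialization algorithm, so everything below is an assumption made for illustration: detection is reduced to a single test against a per-class radius, and the new predictor is a softmax-weighted blend of known-class weights. The function names `detect_unknowns` and `init_new_predictor` are hypothetical and not taken from the paper.

```python
import numpy as np

def detect_unknowns(features, prototypes, radii, unknown_label=-1):
    """Assign each feature to its nearest prototype, or flag it as unknown.

    Simplified stand-in for the paper's triplet thresholding (assumption):
    a sample is accepted only if it lies within the radius of its
    nearest class prototype.
    """
    # Pairwise Euclidean distances, shape (n_samples, n_classes).
    dists = np.linalg.norm(features[:, None, :] - prototypes[None, :, :], axis=2)
    nearest = dists.argmin(axis=1)
    d_min = dists[np.arange(len(features)), nearest]
    return np.where(d_min <= radii[nearest], nearest, unknown_label)

def init_new_predictor(new_features, prototypes, weights):
    """Initialize a predictor for a new class from the known-class predictors.

    Hypothetical distance-based scheme (assumption): known classes whose
    prototypes lie closer to the new class contribute more to its weights.
    """
    # Prototype of the new class: mean feature of its few labeled samples.
    new_proto = new_features.mean(axis=0)
    d = np.linalg.norm(prototypes - new_proto, axis=1)
    # Softmax over negative distances; subtracting the minimum keeps exp stable.
    sim = np.exp(-(d - d.min()))
    coeff = sim / sim.sum()
    # Blend the rows of `weights` (one row per known class).
    return coeff @ weights, new_proto
```

In this sketch a sample far from every prototype gets the label `-1`, which corresponds to the point where the abstract calls for manual labeling. The blended weights only provide a starting point for fine-tuning on the few labeled samples.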