Department of Information Technology, VNUHCM-University of Science, HCM 70000, Vietnam.
Researcher at AIOZ Pte Ltd, HCM 70000, Vietnam.
Comput Intell Neurosci. 2019 Jul 18;2019:1483294. doi: 10.1155/2019/1483294. eCollection 2019.
Object retrieval plays an increasingly important role in video surveillance, digital marketing, e-commerce, and related fields. It faces challenges such as large-scale datasets, imbalanced data, viewpoint variation, cluttered backgrounds, and fine-grained details (attributes). This paper proposes a model that integrates an object ontology, a local multitask deep neural network (local MDNN), and an imbalanced data solver to exploit the strengths and overcome the shortcomings of deep learning network models, improving the performance of a large-scale object retrieval system from the coarse-grained level (categories) to the fine-grained level (attributes). Our proposed coarse-to-fine object retrieval (CFOR) system is designed to be robust to the challenges listed above. To the best of our knowledge, the main novelty of our CFOR system is the mutual support of the object ontology, the local MDNN, and the imbalanced data solver in a unified system. The object ontology exploits inner-group correlations to improve category and attribute classification, and it guides the training flow and the retrieval flow to save computational cost in the training and retrieval stages, respectively, on large-scale datasets. The local MDNN links the object ontology to the raw data, and the imbalanced data solver, based on the Matthews correlation coefficient (MCC), addresses data imbalance, which contributes effectively to the quality of the object ontology realization without adjusting the network architecture or resorting to data augmentation. To evaluate the performance of the CFOR system, we experimented on the DeepFashion dataset.
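As a point of reference for the MCC mentioned above, the following is a minimal Python sketch of the metric itself, computed from binary confusion-matrix counts. It is not the paper's imbalanced data solver: the abstract does not say how MCC is applied, and the function name and the example counts are illustrative assumptions.

```python
import math


def matthews_corrcoef_from_counts(tp, tn, fp, fn):
    """Matthews correlation coefficient from binary confusion-matrix counts.

    Hypothetical helper for illustration only; returns 0.0 when the
    denominator is zero (a common convention, assumed here).
    """
    numerator = tp * tn - fp * fn
    denominator = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return numerator / denominator if denominator else 0.0


# Assumed toy data: 1000 samples, only 50 of them positive for an attribute.
# A classifier that always predicts "negative" is 95% accurate,
# yet its MCC is 0, which exposes the imbalance.
always_negative = matthews_corrcoef_from_counts(tp=0, tn=950, fp=0, fn=50)
perfect = matthews_corrcoef_from_counts(tp=50, tn=950, fp=0, fn=0)
print(always_negative, perfect)
```

The toy counts illustrate why MCC is a common choice for rare attributes: unlike accuracy, it does not reward a model for ignoring the minority class.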
This paper shows that our local MDNN framework, based on the pretrained NASNet architecture, achieves better performance (14.2% higher recall rate) than single-task learning (STL) on the attribute learning task; it also shows that our model with the imbalanced data solver achieves better performance (5.14% higher recall rate on attributes with less data) than models that do not account for data imbalance. Moreover, MAP@30 is around 0.815 in retrieval, averaged over 35 imbalanced fashion attributes.
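For readers unfamiliar with MAP@30, here is a minimal Python sketch of mean average precision at a cutoff k. Several normalizations of AP@k are in use and the abstract does not say which one the authors adopt; this sketch assumes normalization by the number of relevant items found in the top k. The function names and the toy relevance lists are illustrative.

```python
def average_precision_at_k(relevance, k):
    """AP@k for one query.

    `relevance` is a ranked list of booleans (True = relevant result).
    Assumption: normalize by the number of relevant hits within the top k.
    """
    hits = 0
    precision_sum = 0.0
    for rank, is_relevant in enumerate(relevance[:k], start=1):
        if is_relevant:
            hits += 1
            precision_sum += hits / rank  # precision at this rank
    return precision_sum / hits if hits else 0.0


def mean_average_precision_at_k(queries, k=30):
    """MAP@k: the mean of AP@k over all queries."""
    return sum(average_precision_at_k(q, k) for q in queries) / len(queries)


# Toy ranked results for two queries (assumed data, not from the paper).
toy_queries = [
    [True, False, True],  # AP = (1/1 + 2/3) / 2
    [False, False],       # no relevant result retrieved, so AP = 0
]
print(mean_average_precision_at_k(toy_queries, k=30))
```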