Image Recognition by Predicted User Click Feature With Multidomain Multitask Transfer Deep Network.

IEEE Trans Image Process. 2019 Dec;28(12):6047-6062. doi: 10.1109/TIP.2019.2921861. Epub 2019 Jun 28.

The click feature of an image, defined as a user click count vector based on click data, has been demonstrated to be effective for reducing the semantic gap for image recognition. Unfortunately, most of the traditional image recognition datasets do not contain click data. To address this problem, researchers have begun to develop a click prediction model using assistant datasets containing click information and have adapted this predictor to a common click-free dataset for different tasks. This method can be customized to our problem, but it has two main limitations: 1) the predicted click feature often performs badly in the recognition task since the prediction model is constructed independently of the subsequent recognition problem and 2) transferring the predictor from one dataset to another is challenging due to the large cross-domain diversity. In this paper, we devise a multitask and multidomain deep network with varied modals (MTMDD-VM) to formulate image recognition and click prediction tasks in a unified framework. Datasets with and without click information are integrated in the training. Furthermore, a nonlinear word embedding with a position-sensitive loss function is designed to discover the visual click correlation. We evaluate the proposed method on three public dog breed image datasets, and we utilize the Clickture-Dog dataset as the auxiliary dataset that provides click data. The experimental results show that: 1) the nonlinear word embedding and position-sensitive loss function largely enhance the predicted click feature in the recognition task, realizing a 32% improvement in accuracy; 2) the multitask learning framework improves accuracies in both image recognition and click prediction; and 3) the unified training using the combined dataset with and without click data further improves the performance. 
Compared with the state-of-the-art methods, the proposed approach not only performs much better in accuracy but also achieves good scalability and one-shot learning ability.
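The abstract defines the click feature of an image as a user click count vector built from click data. As a rough illustration only, the sketch below builds such a vector from a toy click log. The log layout (image id, query text, click count), the fixed query vocabulary, and the function name `click_count_vector` are all assumptions made for this example; they are not taken from the paper or from the Clickture-Dog dataset.

```python
from collections import Counter


def click_count_vector(click_log, image_id, vocabulary):
    """Return the click counts of one image over a fixed query vocabulary.

    click_log  : iterable of (image_id, query, clicks) tuples (hypothetical layout)
    image_id   : the image whose click feature is wanted
    vocabulary : ordered list of queries; position i of the result counts vocabulary[i]
    """
    counts = Counter()
    for img, query, clicks in click_log:
        if img == image_id:
            counts[query] += clicks  # accumulate repeated (image, query) records
    # Queries never clicked for this image get a count of 0.
    return [counts[q] for q in vocabulary]


# Hypothetical click log, invented for illustration.
click_log = [
    ("img_001", "golden retriever", 12),
    ("img_001", "retriever puppy", 3),
    ("img_002", "beagle", 7),
    ("img_001", "golden retriever", 5),
]
vocabulary = ["golden retriever", "retriever puppy", "beagle"]

print(click_count_vector(click_log, "img_001", vocabulary))  # [17, 3, 0]
```

An image from a click-free dataset has no rows in such a log, so its vector cannot be computed this way; that is the gap the click prediction task described above is meant to fill.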

Similar articles

Image Recognition by Predicted User Click Feature With Multidomain Multitask Transfer Deep Network.

IEEE Trans Image Process. 2019 Dec;28(12):6047-6062. doi: 10.1109/TIP.2019.2921861. Epub 2019 Jun 28.

Hierarchical Deep Click Feature Prediction for Fine-Grained Image Recognition.

IEEE Trans Pattern Anal Mach Intell. 2022 Feb;44(2):563-578. doi: 10.1109/TPAMI.2019.2932058. Epub 2022 Jan 7.

Task-Oriented Feature-Fused Network With Multivariate Dataset for Joint Face Analysis.

IEEE Trans Cybern. 2020 Mar;50(3):1292-1305. doi: 10.1109/TCYB.2019.2917049. Epub 2019 Jun 5.

Deep Aesthetic Quality Assessment With Semantic Information.

IEEE Trans Image Process. 2017 Mar;26(3):1482-1495. doi: 10.1109/TIP.2017.2651399. Epub 2017 Jan 11.

Compositional model based on factorial evolution for realizing multi-task learning in bacterial virulent protein prediction.

Artif Intell Med. 2019 Nov;101:101757. doi: 10.1016/j.artmed.2019.101757. Epub 2019 Nov 7.

Transductive multi-view zero-shot learning.

IEEE Trans Pattern Anal Mach Intell. 2015 Nov;37(11):2332-45. doi: 10.1109/TPAMI.2015.2408354.

Learning of Multimodal Representations With Random Walks on the Click Graph.

IEEE Trans Image Process. 2016 Feb;25(2):630-42. doi: 10.1109/TIP.2015.2507401. Epub 2015 Dec 9.

Unifying Visual Attribute Learning with Object Recognition in a Multiplicative Framework.

IEEE Trans Pattern Anal Mach Intell. 2019 Jul;41(7):1747-1760. doi: 10.1109/TPAMI.2018.2836461. Epub 2018 Jun 4.

A multitask bi-directional RNN model for named entity recognition on Chinese electronic medical records.

BMC Bioinformatics. 2018 Dec 28;19(Suppl 17):499. doi: 10.1186/s12859-018-2467-9.

Task Sensitive Feature Exploration and Learning for Multitask Graph Classification.

IEEE Trans Cybern. 2017 Mar;47(3):744-758. doi: 10.1109/TCYB.2016.2526058. Epub 2016 Mar 10.
