
Visual Saliency Prediction Using a Mixture of Deep Neural Networks

Authors

Dodge Samuel, Karam Lina

Publication

IEEE Trans Image Process. 2018 May 9. doi: 10.1109/TIP.2018.2834826.

Abstract

Visual saliency models have recently begun to incorporate deep learning to achieve predictive capacity much greater than previous unsupervised methods. However, most existing models predict saliency without explicit knowledge of global scene semantic information. We propose a model (MxSalNet) that incorporates global scene semantic information in addition to local information gathered by a convolutional neural network. Our model is formulated as a mixture of experts. Each expert network is trained to predict saliency for a set of closely related images. The final saliency map is computed as a weighted mixture of the expert networks' output, with weights determined by a separate gating network. This gating network is guided by global scene information to predict weights. The expert networks and the gating network are trained simultaneously in an end-to-end manner. We show that our mixture formulation leads to improved performance over an otherwise identical non-mixture model that does not incorporate global scene information. Additionally, we show that our model achieves better performance than several other visual saliency models.
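The combination step described in the abstract, in which expert saliency maps are mixed using weights produced by a gating network from global scene features, can be sketched as follows. All names here (`mixture_saliency`, `gate`, `experts`) are illustrative assumptions rather than the paper's actual API, and the toy constant experts stand in for trained CNN branches.

```python
import numpy as np

def softmax(x):
    """Numerically stable softmax over a 1-D array of gating scores."""
    e = np.exp(x - np.max(x))
    return e / e.sum()

def mixture_saliency(local_feats, global_feats, experts, gate):
    """Weighted mixture of expert saliency maps.

    `experts` is a list of functions mapping local features to an (H, W)
    saliency map; `gate` maps global scene features to one raw score per
    expert. The final map is the gating-weighted sum of the expert maps.
    """
    weights = softmax(gate(global_feats))               # shape (E,)
    maps = np.stack([f(local_feats) for f in experts])  # shape (E, H, W)
    return np.tensordot(weights, maps, axes=([0], [0]))  # shape (H, W)

# Toy demo: two constant "experts"; the gate strongly favors the second,
# so the output is close to the second expert's all-ones map.
experts = [lambda x: np.zeros((4, 4)), lambda x: np.ones((4, 4))]
gate = lambda g: np.array([0.0, 10.0])
out = mixture_saliency(None, None, experts, gate)
```

Because the gating weights come from a softmax, the output is a convex combination of the expert maps, so training the gate and experts jointly (as the paper does end-to-end) lets the gate specialize experts to clusters of similar scenes.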

