Ran Song, Yonghuai Liu, Paul L. Rosin
IEEE Trans Vis Comput Graph. 2021 Jan;27(1):151-164. doi: 10.1109/TVCG.2019.2928794. Epub 2020 Nov 24.
Recently, effort has been made to apply deep learning to the detection of mesh saliency. However, one major barrier is collecting large amounts of vertex-level annotations as saliency ground truth for training the neural networks; several pilot studies have shown that this task is difficult. In this work, we solve this problem by developing a novel network trained in a weakly supervised manner. The training is end-to-end and does not require any saliency ground truth, only the class membership of meshes. Our Classification-for-Saliency CNN (CfS-CNN) employs a multi-view setup and contains a newly designed two-channel structure that integrates view-based features for both classification and saliency. It essentially transfers knowledge from 3D object classification to mesh saliency. Extensive experimental results show that our approach significantly outperforms existing state-of-the-art methods. The CfS-CNN can also be used directly for scene saliency, and we showcase two novel applications based on scene saliency to demonstrate its utility.