IEEE Trans Image Process. 2016 Apr;25(4):1566-79. doi: 10.1109/TIP.2016.2522380. Epub 2016 Jan 27.
A large number of saliency models, each based on a different hypothesis, have been proposed over the past 20 years. In practice, committing to a single hypothesis or computational principle yields a model that performs well on some types of images but generalizes poorly to arbitrary images and large-scale data sets. A natural way to improve overall saliency detection accuracy is therefore to fuse different types of models. In this paper, inspired by the success of late-fusion strategies in semantic analysis and multi-modal biometrics, we propose to fuse state-of-the-art saliency models at the score level in a para-boosting learning fashion. First, the saliency maps generated by several models are used as confidence scores. Then, these scores are fed into our para-boosting learner (i.e., support vector machine, adaptive boosting, or probability density estimator) to generate the final saliency map. To assess the strength of the para-boosting learners, traditional transformation-based fusion strategies, such as Sum, Min, and Max, are also evaluated and compared in this paper. To further reduce the computational cost of fusing many models, only a few of them are retained in a subsequent step. Experimental results show that score-level fusion outperforms each individual model and further narrows the performance gap between current models and the human inter-observer model.
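As a rough illustration of the score-level fusion described in the abstract, the sketch below treats each model's saliency map as a per-pixel confidence score and combines the maps in two ways: with the fixed Sum, Min, and Max rules, and with a learned combination. It is a minimal sketch under stated assumptions, not the authors' implementation. The function names (`normalize`, `fuse`, `fit_linear_fusion`, `apply_linear_fusion`), the min-max normalization step, and the toy maps are all hypothetical, and the learned stage uses an ordinary least-squares fit purely as a simple stand-in for the learners named in the abstract (support vector machine, adaptive boosting, probability density estimator).

```python
import numpy as np

def normalize(saliency_map):
    """Rescale one map to [0, 1] so scores from different models are comparable.
    (Min-max scaling is an assumption; the abstract does not specify it.)"""
    m = np.asarray(saliency_map, dtype=float)
    span = m.max() - m.min()
    if span == 0:
        return np.zeros_like(m)
    return (m - m.min()) / span

def fuse(saliency_maps, rule="sum"):
    """Transformation-based fusion: combine same-sized maps pixel by pixel."""
    stack = np.stack([normalize(m) for m in saliency_maps])  # shape (K, H, W)
    if rule == "sum":
        fused = stack.sum(axis=0)
    elif rule == "min":
        fused = stack.min(axis=0)
    elif rule == "max":
        fused = stack.max(axis=0)
    else:
        raise ValueError(f"unknown rule: {rule}")
    return normalize(fused)

def fit_linear_fusion(saliency_maps, ground_truth):
    """Learned fusion stand-in: each pixel is a sample whose features are the
    K model scores; fit per-model weights plus a bias by least squares."""
    X = np.stack([normalize(m).ravel() for m in saliency_maps], axis=1)  # (H*W, K)
    X = np.hstack([X, np.ones((X.shape[0], 1))])  # append bias column
    y = np.asarray(ground_truth, dtype=float).ravel()
    weights, *_ = np.linalg.lstsq(X, y, rcond=None)
    return weights

def apply_linear_fusion(saliency_maps, weights):
    """Apply fitted weights to the K model maps to produce the final map."""
    stack = np.stack([normalize(m) for m in saliency_maps])  # shape (K, H, W)
    fused = np.tensordot(weights[:-1], stack, axes=1) + weights[-1]
    return normalize(fused)

# Toy 2x2 "saliency maps" from two hypothetical models.
map_a = np.array([[0.0, 2.0], [4.0, 8.0]])
map_b = np.array([[1.0, 0.0], [1.0, 1.0]])

fused_sum = fuse([map_a, map_b], "sum")
fused_min = fuse([map_a, map_b], "min")
fused_max = fuse([map_a, map_b], "max")

# Toy ground truth (for example, a fixation density map); here it equals map_a
# after scaling, so the fitted fusion should reproduce that map.
target = normalize(map_a)
weights = fit_linear_fusion([map_a, map_b], target)
fused_learned = apply_linear_fusion([map_a, map_b], weights)
```

The fixed rules need no training data, whereas the learned stage requires ground-truth maps; swapping the least-squares fit for a classifier trained on the same per-pixel score vectors would bring the sketch closer to the learners the abstract lists.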