Carneiro Gustavo, Chan Antoni B, Moreno Pedro J, Vasconcelos Nuno
Integrated Data Systems Department, Siemens Corporate Research, Princeton, NJ 08540, USA.
IEEE Trans Pattern Anal Mach Intell. 2007 Mar;29(3):394-410. doi: 10.1109/TPAMI.2007.61.
A probabilistic formulation for semantic image annotation and retrieval is proposed. Annotation and retrieval are posed as classification problems where each class is defined as the group of database images labeled with a common semantic label. It is shown that, by establishing this one-to-one correspondence between semantic labels and semantic classes, minimum-probability-of-error annotation and retrieval are feasible with algorithms that 1) are conceptually simple, 2) are computationally efficient, and 3) do not require prior semantic segmentation of training images. In particular, images are represented as bags of localized feature vectors, a mixture density is estimated for each image, and the mixtures associated with all images annotated with a common semantic label are pooled into a density estimate for the corresponding semantic class. This pooling is justified by a multiple instance learning argument and performed efficiently with a hierarchical extension of expectation-maximization. The benefits of the supervised formulation over the more complex, and currently popular, joint modeling of semantic label and visual feature distributions are illustrated through theoretical arguments and extensive experiments. The supervised formulation is shown to achieve higher accuracy than various previously published methods at a fraction of their computational cost. Finally, the proposed method is shown to be fairly robust to parameter tuning.
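The pipeline summarized in the abstract (bag of feature vectors per image, a density per image, pooling into one density per label, then classification) can be illustrated with a minimal sketch. It is not the authors' implementation, and it makes several simplifying assumptions: each image "mixture" is a single diagonal Gaussian, pooling is a plain equal-weight average of the image densities rather than the hierarchical expectation-maximization the abstract refers to, and label priors are taken proportional to the number of training images per label. All function names and the toy data are hypothetical.

```python
import math
from collections import defaultdict

def fit_gaussian(bag, var_floor=1e-3):
    """Fit one diagonal Gaussian to a bag of feature vectors.
    Assumption: stands in for the per-image mixture density."""
    n, d = len(bag), len(bag[0])
    mean = [sum(x[j] for x in bag) / n for j in range(d)]
    var = [max(sum((x[j] - mean[j]) ** 2 for x in bag) / n, var_floor)
           for j in range(d)]
    return mean, var

def log_gaussian(x, mean, var):
    """Log density of a diagonal Gaussian at x."""
    return sum(-0.5 * (math.log(2 * math.pi * v) + (xi - m) ** 2 / v)
               for xi, m, v in zip(x, mean, var))

def log_mixture(x, components):
    """Log density of an equal-weight mixture, via log-sum-exp."""
    logs = [log_gaussian(x, m, v) for m, v in components]
    top = max(logs)
    return top + math.log(sum(math.exp(l - top) for l in logs)) - math.log(len(logs))

def train(images):
    """images: list of (bag, labels). Pools image densities per label.
    Assumption: naive averaging, not hierarchical EM."""
    classes = defaultdict(list)
    for bag, labels in images:
        component = fit_gaussian(bag)
        for label in labels:
            classes[label].append(component)
    total = sum(len(comps) for comps in classes.values())
    priors = {label: len(comps) / total for label, comps in classes.items()}
    return dict(classes), priors

def annotate(bag, classes, priors, top_k=1):
    """Rank labels by log prior plus summed log-likelihood of the bag
    (feature vectors treated as independent given the label)."""
    scores = {label: math.log(priors[label])
                     + sum(log_mixture(x, comps) for x in bag)
              for label, comps in classes.items()}
    return sorted(scores, key=scores.get, reverse=True)[:top_k]

# Toy 2-D "features"; the last image carries both labels and no segmentation.
images = [
    ([[0.0, 0.1], [0.2, 0.0], [0.1, 0.2]], ["sky"]),
    ([[0.1, 0.0], [0.0, 0.2], [0.3, 0.1]], ["sky"]),
    ([[5.0, 5.1], [5.2, 4.9], [4.8, 5.0]], ["grass"]),
    ([[0.1, 0.1], [0.0, 0.0], [5.1, 5.0], [4.9, 5.2]], ["sky", "grass"]),
]
classes, priors = train(images)
print(annotate([[0.05, 0.1], [0.15, 0.05]], classes, priors))
```

Retrieval under the same sketch would rank database images by the posterior of the query label rather than ranking labels for a query image. A fuller version would fit a multi-component mixture per image and replace the averaging step with the hierarchical pooling described in the abstract.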