Supervised learning of semantic classes for image annotation and retrieval.

Author information

Carneiro Gustavo, Chan Antoni B, Moreno Pedro J, Vasconcelos Nuno

Affiliation

Integrated Data Systems Department, Siemens Corporate Research, Princeton, NJ 08540, USA.

Publication information

IEEE Trans Pattern Anal Mach Intell. 2007 Mar;29(3):394-410. doi: 10.1109/TPAMI.2007.61.

DOI:10.1109/TPAMI.2007.61

PMID:17224611

Abstract

A probabilistic formulation for semantic image annotation and retrieval is proposed. Annotation and retrieval are posed as classification problems where each class is defined as the group of database images labeled with a common semantic label. It is shown that, by establishing this one-to-one correspondence between semantic labels and semantic classes, a minimum probability of error annotation and retrieval are feasible with algorithms that are 1) conceptually simple, 2) computationally efficient, and 3) do not require prior semantic segmentation of training images. In particular, images are represented as bags of localized feature vectors, a mixture density estimated for each image, and the mixtures associated with all images annotated with a common semantic label pooled into a density estimate for the corresponding semantic class. This pooling is justified by a multiple instance learning argument and performed efficiently with a hierarchical extension of expectation-maximization. The benefits of the supervised formulation over the more complex, and currently popular, joint modeling of semantic label and visual feature distributions are illustrated through theoretical arguments and extensive experiments. The supervised formulation is shown to achieve higher accuracy than various previously published methods at a fraction of their computational cost. Finally, the proposed method is shown to be fairly robust to parameter tuning.
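The classification view described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes diagonal-covariance Gaussian mixtures, treats the feature vectors in a bag as independent, and replaces the hierarchical extension of expectation-maximization with naive averaging of the image-level mixtures. All function names, labels, and toy values are hypothetical.

```python
import math

def log_gauss(x, mean, var):
    """Log density of a diagonal-covariance Gaussian at feature vector x."""
    return sum(-0.5 * (math.log(2 * math.pi * v) + (xi - m) ** 2 / v)
               for xi, m, v in zip(x, mean, var))

def log_mixture(x, mixture):
    """Log density of a mixture given as a list of (weight, mean, var)."""
    terms = [math.log(w) + log_gauss(x, m, v) for w, m, v in mixture]
    top = max(terms)
    # log-sum-exp for numerical stability
    return top + math.log(sum(math.exp(t - top) for t in terms))

def pool(image_mixtures):
    """Naive pooling (assumption): average the image-level mixtures of a class.

    The abstract describes a hierarchical EM step here instead.
    """
    n = len(image_mixtures)
    return [(w / n, m, v) for mix in image_mixtures for w, m, v in mix]

def annotate(bag, class_mixtures, priors, top_k=1):
    """Rank labels by log prior plus log-likelihood of the bag of features."""
    scores = {
        label: math.log(priors[label]) + sum(log_mixture(x, mix) for x in bag)
        for label, mix in class_mixtures.items()
    }
    return sorted(scores, key=scores.get, reverse=True)[:top_k]

# Toy data (hypothetical): two labels, two training images each, 2-D features.
sky = pool([[(1.0, (0.0, 0.0), (1.0, 1.0))], [(1.0, (0.5, -0.5), (1.0, 1.0))]])
grass = pool([[(1.0, (5.0, 5.0), (1.0, 1.0))], [(1.0, (4.5, 5.5), (1.0, 1.0))]])
classes = {"sky": sky, "grass": grass}
priors = {"sky": 0.5, "grass": 0.5}
query_bag = [(0.2, 0.1), (0.4, -0.3), (-0.1, 0.0)]
```

Calling `annotate(query_bag, classes, priors)` ranks the labels for the query image. Retrieval for a given label could reuse the same scores by ranking database images instead of labels.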

Similar articles

Localized content-based image retrieval.

IEEE Trans Pattern Anal Mach Intell. 2008 Nov;30(11):1902-12. doi: 10.1109/TPAMI.2008.112.

Automatic semantic annotation of real-world web images.

IEEE Trans Pattern Anal Mach Intell. 2008 Nov;30(11):1933-44. doi: 10.1109/TPAMI.2008.125.

Modeling semantic aspects for cross-media image indexing.

IEEE Trans Pattern Anal Mach Intell. 2007 Oct;29(10):1802-17. doi: 10.1109/TPAMI.2007.1097.

Universal and adapted vocabularies for generic visual categorization.

IEEE Trans Pattern Anal Mach Intell. 2008 Jul;30(7):1243-56. doi: 10.1109/TPAMI.2007.70755.

Document image retrieval through word shape coding.

IEEE Trans Pattern Anal Mach Intell. 2008 Nov;30(11):1913-8. doi: 10.1109/TPAMI.2008.89.

Real-time computerized annotation of pictures.

IEEE Trans Pattern Anal Mach Intell. 2008 Jun;30(6):985-1002. doi: 10.1109/TPAMI.2007.70847.

80 million tiny images: a large data set for nonparametric object and scene recognition.

IEEE Trans Pattern Anal Mach Intell. 2008 Nov;30(11):1958-70. doi: 10.1109/TPAMI.2008.128.

Context-based object-class recognition and retrieval by generalized correlograms.

IEEE Trans Pattern Anal Mach Intell. 2007 Oct;29(10):1818-33. doi: 10.1109/TPAMI.2007.1098.

Active learning methods for interactive image retrieval.

IEEE Trans Image Process. 2008 Jul;17(7):1200-11. doi: 10.1109/TIP.2008.924286.

Cited by

Hybrid Encryption Method for Health Monitoring Systems Based on Machine Learning.

Comput Intell Neurosci. 2022 Jul 7;2022:7348488. doi: 10.1155/2022/7348488. eCollection 2022.

Automated Diagnostics: Advances in the Diagnosis of Intestinal Parasitic Infections in Humans and Animals.

Front Vet Sci. 2021 Nov 23;8:715406. doi: 10.3389/fvets.2021.715406. eCollection 2021.

Comprehensive strategies of machine-learning-based quantitative structure-activity relationship models.

iScience. 2021 Aug 28;24(9):103052. doi: 10.1016/j.isci.2021.103052. eCollection 2021 Sep 24.

The VISIONE Video Search System: Exploiting Off-the-Shelf Text Search Engines for Large-Scale Video Retrieval.

J Imaging. 2021 Apr 23;7(5):76. doi: 10.3390/jimaging7050076.

A Novel Model on Reinforce K-Means Using Location Division Model and Outlier of Initial Value for Lowering Data Cost.

Entropy (Basel). 2020 Aug 17;22(8):902. doi: 10.3390/e22080902.

Learning layer-specific edges for segmenting retinal layers with large deformations.

Biomed Opt Express. 2016 Jun 30;7(7):2888-901. doi: 10.1364/BOE.7.002888. eCollection 2016 Jul 1.

Object recognition with hierarchical discriminant saliency networks.

Front Comput Neurosci. 2014 Sep 9;8:109. doi: 10.3389/fncom.2014.00109. eCollection 2014.

An Adaboost-backpropagation neural network for automated image sentiment classification.

ScientificWorldJournal. 2014;2014:364649. doi: 10.1155/2014/364649. Epub 2014 Aug 4.

The National Cancer Informatics Program (NCIP) Annotation and Image Markup (AIM) Foundation model.

J Digit Imaging. 2014 Dec;27(6):692-701. doi: 10.1007/s10278-014-9710-3.

A semantic medical multimedia retrieval approach using ontology information hiding.

Comput Math Methods Med. 2013;2013:407917. doi: 10.1155/2013/407917. Epub 2013 Sep 9.
