Lin Wei-Chao, Oakes Michael, Tait John, Tsai Chih-Fong
Department of Computing, Engineering and Technology, University of Sunderland, Sunderland, SR6 0DD, UK.
Cogn Process. 2009 Aug;10(3):233-42. doi: 10.1007/s10339-008-0247-6. Epub 2008 Dec 13.
This paper describes the automatic assignment of images to classes described by individual keywords provided with the Corel data set. Automatic image annotation technology aims to provide an efficient and effective search environment in which users can query their images more easily, but current image retrieval systems are still not very accurate when assigning images to a large number of keyword classes. Noisy features are the main problem, causing some keywords never to be assigned to their correct images. This paper focuses on improving image classification, first by selecting the features that characterise each image, and then by selecting the most suitable feature vectors as training data. A Pixel Density filter (PDfilter) and Information Gain (IG) are proposed to perform these respective tasks. We filter out the noisy features so that groups of images can be represented by their most important values. The experiments use the hue, saturation and value (HSV) colour feature space to categorise images according to one of 190 concrete keywords or subsets of these. The study shows that feature selection through the PDfilter and IG can reduce the problem of spurious similarity.
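The abstract names Information Gain (IG) without defining it. As a minimal sketch, the block below computes the standard textbook IG of a discretised feature with respect to class labels, that is, the entropy of the labels minus their entropy conditioned on the feature. It is not the authors' implementation: the function names, the toy keyword labels and the binned hue and saturation values are all hypothetical, and the abstract does not say how the paper bins HSV values or applies IG to select training vectors.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a sequence of class labels."""
    total = len(labels)
    counts = Counter(labels)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


def information_gain(feature_values, labels):
    """IG = H(labels) - H(labels | feature) for a discrete feature."""
    total = len(labels)
    # Group the class labels by the feature value they co-occur with.
    groups = {}
    for value, label in zip(feature_values, labels):
        groups.setdefault(value, []).append(label)
    # Weighted average of the label entropy within each group.
    conditional = sum(len(g) / total * entropy(g) for g in groups.values())
    return entropy(labels) - conditional


# Hypothetical toy data: four images, two keyword classes.
labels = ["sky", "sky", "grass", "grass"]
hue_bin = ["blue", "blue", "green", "green"]  # separates the classes perfectly
sat_bin = ["low", "high", "low", "high"]      # carries no class information

print(information_gain(hue_bin, labels))
print(information_gain(sat_bin, labels))
```

In this toy case the hue bin receives the maximum possible gain for two balanced classes (1 bit) and the saturation bin receives none, which is the sense in which a low-gain feature could be treated as noisy and filtered out.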