Bajić Filip, Job Josip
University of Zagreb University Computing Centre, 10000 Zagreb, Croatia.
Faculty of Electrical Engineering, Computer Science and Information Technology Osijek, 31000 Osijek, Croatia.
J Imaging. 2021 Oct 21;7(11):220. doi: 10.3390/jimaging7110220.
The first step in recovering information from a chart image should be chart type classification. Many approaches have been proposed over the years, with varying success. The most recent articles combine a Support Vector Machine (SVM) with a Convolutional Neural Network (CNN) and achieve almost perfect results on datasets of a few thousand images per class. Datasets of chart images are primarily synthetic and lack real-world examples. To overcome the problem of small datasets, we use a Siamese CNN architecture for chart type classification; to our knowledge, this is the first report of doing so. Multiple network architectures are tested, and the results for different dataset sizes are compared. The network is verified using few-shot learning (FSL). Many of the described advantages of Siamese CNNs are shown in examples. Finally, we show that the Siamese CNN can work with one image per class and that it achieves a 100% average classification accuracy with 50 images per class, whereas the CNN achieves an average classification accuracy of only 43% on the same dataset.
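The few-shot verification mentioned in the abstract can be illustrated with a minimal sketch. The abstract does not describe the authors' network or distance measure, so everything here is an assumption: `toy_embed` is a hypothetical stand-in for one trained twin of a Siamese CNN, the Euclidean distance and the nearest-support-example decision rule are common choices rather than the paper's confirmed method, and the class names and feature vectors are invented toy data.

```python
import math
from typing import Callable, Dict, List, Sequence

Vector = Sequence[float]


def euclidean(a: Vector, b: Vector) -> float:
    """Euclidean distance between two embedding vectors of equal length."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def classify_few_shot(
    query: Vector,
    support: Dict[str, List[Vector]],
    embed: Callable[[Vector], Vector],
) -> str:
    """Return the label whose closest support example is nearest to the query.

    `support` maps each class label to its few labelled examples
    (one example per class gives one-shot classification).
    """
    q = embed(query)
    best_label, best_dist = "", math.inf
    for label, examples in support.items():
        # Distance from the query to the closest support example of this class.
        d = min(euclidean(q, embed(x)) for x in examples)
        if d < best_dist:
            best_label, best_dist = label, d
    return best_label


def toy_embed(image: Vector) -> Vector:
    """Hypothetical stand-in for a trained Siamese twin: identity mapping."""
    return list(image)


# Toy one-shot setup: one invented "image" (feature vector) per chart class.
support = {"bar": [[1.0, 0.0]], "pie": [[0.0, 1.0]], "line": [[1.0, 1.0]]}
queries = [([0.9, 0.1], "bar"), ([0.1, 0.8], "pie"), ([0.9, 1.1], "line")]

correct = sum(classify_few_shot(q, support, toy_embed) == y for q, y in queries)
accuracy = correct / len(queries)
print(f"average classification accuracy: {accuracy:.0%}")
```

In a real setting, `toy_embed` would be replaced by the shared convolutional branch of the trained Siamese network, and the support set would hold the available chart images per class.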