Institute of Theoretical and Applied Informatics, Polish Academy of Sciences, 44-100 Gliwice, Poland.
Sensors (Basel). 2020 Nov 21;20(22):6666. doi: 10.3390/s20226666.
In recent years, growing interest in deep learning neural networks has raised the question of how they can be used to process the high-dimensional datasets produced by hyperspectral imaging (HSI) effectively. HSI, traditionally viewed as part of remote sensing, is used for non-invasive substance classification. One area of potential application is forensic science, where classifying substances at the scene is important. An example problem from that area, blood stain classification, serves as a case study for evaluating methods that process hyperspectral data. To investigate deep learning classification performance on this problem, we performed experiments on a dataset that had not previously been tested with this kind of model. The dataset consists of several images containing blood and blood-like substances such as ketchup, tomato concentrate and artificial blood. To test both the classic approach to hyperspectral classification and a more realistic, application-oriented scenario, we prepared two sets of experiments. In the first, Hyperspectral Transductive Classification (HTC), the training set and the test set come from the same image. In the second, Hyperspectral Inductive Classification (HIC), the test set is derived from a different image, which is more challenging for classifiers but more useful from the point of view of forensic investigators. We conducted the study using several architectures: 1D, 2D and 3D convolutional neural networks (CNN), a recurrent neural network (RNN) and a multilayer perceptron (MLP). The performance of the models was compared with baseline results from a Support Vector Machine (SVM). We also present a model evaluation method based on t-SNE and confusion matrix analysis that allows us to detect and eliminate some cases of model undertraining.
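The difference between the two experimental setups can be illustrated with a minimal sketch. This is not the authors' code: the function names, the `train_fraction` value, the random (non-stratified) sampling and the toy two-band "images" are all assumptions made for illustration. Each image is represented as a pair of lists, one of pixel spectra and one of labels.

```python
import random


def transductive_split(pixels, labels, train_fraction=0.1, seed=0):
    """HTC-style split: training and test pixels come from the SAME image.

    train_fraction and the plain random sampling are illustrative
    assumptions, not values taken from the paper.
    """
    rng = random.Random(seed)
    indices = list(range(len(pixels)))
    rng.shuffle(indices)
    n_train = max(1, int(train_fraction * len(indices)))
    train_idx, test_idx = indices[:n_train], indices[n_train:]
    train = [(pixels[i], labels[i]) for i in train_idx]
    test = [(pixels[i], labels[i]) for i in test_idx]
    return train, test


def inductive_split(train_image, test_image):
    """HIC-style split: train on one image, test on a DIFFERENT image.

    Each image is a (pixels, labels) pair.
    """
    train = list(zip(*train_image))
    test = list(zip(*test_image))
    return train, test


# Toy data (hypothetical): two-band spectra with string labels.
image_a = (
    [[0.10, 0.20], [0.15, 0.22], [0.80, 0.30], [0.78, 0.33], [0.50, 0.50]],
    ["blood", "blood", "ketchup", "ketchup", "artificial blood"],
)
image_b = (
    [[0.12, 0.25], [0.75, 0.35], [0.55, 0.45]],
    ["blood", "ketchup", "artificial blood"],
)

htc_train, htc_test = transductive_split(*image_a, train_fraction=0.4)
hic_train, hic_test = inductive_split(image_a, image_b)
```

In the transductive case the classifier sees pixels acquired under the same conditions as the test pixels; in the inductive case it must generalize to a separately acquired image, which is why the abstract describes HIC as the harder setting.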
Our results show that in the transductive case all models, including the MLP and the SVM, have comparable performance, with no clear advantage for the deep learning models. The Overall Accuracy across all models ranges from 98% to 100% for the easier image set and from 74% to 94% for the more difficult one. However, in the more challenging inductive case, selected deep learning architectures offer a significant advantage: their best Overall Accuracy is in the range of 57-71%, improving on the baseline set by the non-deep models by up to 9 percentage points. We present a detailed analysis and discussion of the results, including a summary of conclusions for each tested architecture. An analysis of per-class errors shows that the score for each class is highly model-dependent. Considering this, and the fact that the best-performing models come from two different architecture families (3D CNN and RNN), our results suggest that tailoring the deep neural network architecture to hyperspectral data is still an open problem.
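For readers unfamiliar with the metrics, the sketch below shows how Overall Accuracy and a per-class score can be read off a confusion matrix. It is an illustration under stated assumptions, not the paper's evaluation code: the helper names and the toy labels and predictions are invented, and per-class accuracy is taken here to mean the fraction of each true class that was predicted correctly.

```python
def confusion_matrix(true_labels, predicted_labels, classes):
    """Rows are true classes, columns are predicted classes."""
    index = {name: i for i, name in enumerate(classes)}
    matrix = [[0] * len(classes) for _ in classes]
    for true, predicted in zip(true_labels, predicted_labels):
        matrix[index[true]][index[predicted]] += 1
    return matrix


def overall_accuracy(matrix):
    """Fraction of all samples on the diagonal (correctly classified)."""
    correct = sum(matrix[i][i] for i in range(len(matrix)))
    total = sum(sum(row) for row in matrix)
    return correct / total


def per_class_accuracy(matrix):
    """Fraction of each true class classified correctly (row-wise)."""
    return [
        row[i] / sum(row) if sum(row) else float("nan")
        for i, row in enumerate(matrix)
    ]


# Toy example (hypothetical labels and predictions).
classes = ["blood", "ketchup", "artificial blood"]
y_true = ["blood", "blood", "ketchup", "ketchup",
          "artificial blood", "artificial blood"]
y_pred = ["blood", "ketchup", "ketchup", "ketchup",
          "artificial blood", "blood"]

cm = confusion_matrix(y_true, y_pred, classes)
oa = overall_accuracy(cm)          # 4 of 6 samples correct
per_class = per_class_accuracy(cm)
```

Two models with the same Overall Accuracy can have very different rows in this matrix, which is one way the model-dependence of per-class scores noted above can show up.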