Department of Electrical and Computer Engineering, University of Puerto Rico, Mayaguez, PR 00681, USA.
NASA Glenn Research Center, 21000 Brookpark Rd, Cleveland, OH 44135, USA.
Sensors (Basel). 2022 Feb 18;22(4):1623. doi: 10.3390/s22041623.
Hyperspectral remote sensing has tremendous potential for monitoring land cover and water bodies, owing to the rich spatial and spectral information contained in the images. Obtaining ground-truth data for these images by field sampling is time- and resource-consuming. A semi-supervised method for labeling and classifying hyperspectral images is presented. The unsupervised stage consists of image enhancement by feature extraction, followed by clustering to label the pixels and generate the ground-truth image. The supervised classification stage begins with preprocessing: normalization, computation of principal components, and feature extraction. An ensemble of machine learning models takes the extracted features and the ground-truth data from the unsupervised stage as input, and a decision block then combines the outputs of the models to label the image by majority voting. The ensemble includes support vector machines, gradient boosting, a Gaussian classifier, and a linear perceptron. Overall, gradient boosting gives the best performance for supervised classification of hyperspectral images. The presented ensemble method is useful for generating labeled data for hyperspectral images that lack ground-truth information. It gives an overall accuracy of 93.74% for the Jasper hyperspectral image, 100% for the HSI2 Lake Erie images, and 99.92% for the classification of cyanobacteria or harmful algal blooms and surface scum. The method distinguishes well between blue-green algae and surface scum. The full ensemble pipeline for classifying the Lake Erie images runs 24 times faster on a cloud server than on a workstation.
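The majority-voting decision block described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function names `majority_vote` and `overall_accuracy`, the class labels, and the per-pixel predictions are all hypothetical, and the tie-breaking rule (the label voted first, in classifier order, wins) is an assumption, since the abstract does not say how ties are resolved.

```python
from collections import Counter
from typing import Hashable, List, Sequence


def majority_vote(predictions: Sequence[Sequence[Hashable]]) -> List[Hashable]:
    """Fuse per-pixel labels from several classifiers by majority voting.

    predictions[k][i] is the label classifier k assigns to pixel i.
    Assumption: on a tie, the label that appears first in classifier
    order wins (the abstract does not specify a tie-breaking rule).
    """
    if not predictions:
        raise ValueError("need at least one classifier output")
    n_pixels = len(predictions[0])
    if any(len(p) != n_pixels for p in predictions):
        raise ValueError("all classifiers must label the same pixels")

    fused = []
    for votes in zip(*predictions):  # one tuple of votes per pixel
        # most_common(1) returns the label with the highest count;
        # equal counts keep first-encountered order.
        fused.append(Counter(votes).most_common(1)[0][0])
    return fused


def overall_accuracy(predicted: Sequence[Hashable],
                     reference: Sequence[Hashable]) -> float:
    """Fraction of pixels whose predicted label matches the reference label."""
    if len(predicted) != len(reference) or not reference:
        raise ValueError("label sequences must be non-empty and equal length")
    correct = sum(p == r for p, r in zip(predicted, reference))
    return correct / len(reference)


# Hypothetical outputs of four classifiers for five pixels (illustrative only).
svm_labels        = ["water", "algae", "scum", "water", "algae"]
boosting_labels   = ["water", "algae", "scum", "scum",  "algae"]
gaussian_labels   = ["water", "scum",  "scum", "water", "water"]
perceptron_labels = ["algae", "algae", "water", "scum", "algae"]

fused_labels = majority_vote(
    [svm_labels, boosting_labels, gaussian_labels, perceptron_labels]
)

# Hypothetical reference labels, standing in for the clustering-derived
# ground truth produced by the unsupervised stage.
reference_labels = ["water", "algae", "scum", "scum", "algae"]
accuracy = overall_accuracy(fused_labels, reference_labels)
```

With an even number of classifiers, as in the four-model ensemble named in the abstract, 2-2 ties are possible, so any real implementation needs some explicit tie rule; the one used here is only a placeholder.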