Agahian Farnaz, Funt Brian
J Opt Soc Am A Opt Image Sci Vis. 2014 Jul 1;31(7):1445-52. doi: 10.1364/JOSAA.31.001445.
The spectra in spectral reflectance datasets tend to be highly correlated, so they can be represented more compactly using standard techniques such as principal components analysis (PCA) as part of a lossy compression strategy. However, the presence of outlier spectra can often increase the overall error of the reconstructed spectra. This paper introduces a new outlier modeling (OM) method that detects, clusters, and separately models outliers with their own set of basis vectors. Outliers are defined in terms of the robust Mahalanobis distance, which is computed from robust estimates of the multivariate mean and covariance obtained with the fast minimum covariance determinant algorithm. After the outliers are removed from the main dataset, the performance of PCA on the remaining data improves significantly; however, since outlier spectra are a part of the image, they cannot simply be ignored. The solution is to group the outliers into a small number of clusters and then model each cluster separately using its own cluster-specific PCA-derived bases. Tests show that OM leads to lower reconstruction errors for reflectance spectra in terms of both normalized RMS and goodness of fit.
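The pipeline described in the abstract (robust outlier detection, clustering of the outliers, and a separate PCA basis per cluster) can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation. The following are assumptions not stated in the abstract: scikit-learn's `MinCovDet` as the fast minimum covariance determinant estimator, a chi-square quantile as the outlier cutoff, k-means as the clustering step, and plain RMS as a stand-in for the paper's error measures. The function names, parameter values, and synthetic data are hypothetical.

```python
import numpy as np
from scipy.stats import chi2
from sklearn.cluster import KMeans
from sklearn.covariance import MinCovDet
from sklearn.decomposition import PCA


def pca_reconstruct(X, n_components):
    """Project X onto its own leading principal components and back."""
    pca = PCA(n_components=n_components).fit(X)
    return pca.inverse_transform(pca.transform(X))


def outlier_modeling(X, n_components=3, n_clusters=2, quantile=0.975, seed=0):
    """Hypothetical helper: reconstruct inliers and outlier clusters separately."""
    # Robust mean/covariance from scikit-learn's FAST-MCD implementation.
    mcd = MinCovDet(random_state=seed).fit(X)
    d2 = mcd.mahalanobis(X)  # squared robust Mahalanobis distances
    # Assumed cutoff: a chi-square quantile (the abstract gives no threshold).
    is_outlier = d2 > chi2.ppf(quantile, df=X.shape[1])

    X_hat = np.empty_like(X, dtype=float)
    # Main dataset: PCA basis fitted on the inliers only.
    X_hat[~is_outlier] = pca_reconstruct(X[~is_outlier], n_components)

    # Outliers: cluster them (k-means is an assumption), then use a
    # cluster-specific PCA basis for each cluster.
    outliers = X[is_outlier]
    if len(outliers) > 0:
        k = min(n_clusters, len(outliers))
        labels = KMeans(n_clusters=k, n_init=10, random_state=seed).fit_predict(outliers)
        out_hat = outliers.astype(float)  # copy; tiny clusters are kept as-is
        for c in range(k):
            members = labels == c
            if members.sum() >= 2:
                m = int(min(n_components, members.sum(), X.shape[1]))
                out_hat[members] = pca_reconstruct(outliers[members], m)
        X_hat[is_outlier] = out_hat
    return X_hat, is_outlier


# Synthetic stand-in for a spectral dataset: 300 correlated "inliers"
# plus 30 planted outliers shifted far from the bulk of the data.
rng = np.random.default_rng(0)
scales = np.array([1.0, 0.8, 0.6, 0.4, 0.3, 0.2, 0.1, 0.1])
inliers = rng.normal(size=(300, 8)) * scales
planted = rng.normal(size=(30, 8)) * 0.3 + 4.0
X = np.vstack([inliers, planted])

X_hat, is_outlier = outlier_modeling(X)
rms_om = np.sqrt(np.mean((X - X_hat) ** 2))
rms_pca = np.sqrt(np.mean((X - pca_reconstruct(X, 3)) ** 2))
print(f"flagged: {is_outlier.sum()}, RMS (OM): {rms_om:.4f}, RMS (single PCA): {rms_pca:.4f}")
```

Comparing the two RMS values gives a rough sense of how separate bases for the outliers change the reconstruction error relative to a single global PCA. Real reflectance spectra are more strongly correlated than this toy data, which can make the covariance estimate ill-conditioned, so the sketch may need adjustment before being applied to an actual dataset.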