College of Information and Electrical Engineering, Heilongjiang Bayi Agricultural University, Daqing 163319, China.
Spectrochim Acta A Mol Biomol Spectrosc. 2022 Dec 5;282:121630. doi: 10.1016/j.saa.2022.121630. Epub 2022 Jul 26.
Laplacian Eigenmaps (LE) is a nonlinear dimensionality reduction algorithm based on graph theory. The algorithm uses a Gaussian function to measure the affinity between pairs of points in the adjacency graph. However, the scaling parameter σ of the Gaussian function is a hyperparameter that is tuned empirically. Once σ is fixed, the weight between two points depends solely on the Euclidean distance between them, which is unsuitable for multi-scale sample sets. To optimize the weights in the adjacency graph so that they reflect the scale information of different sample sets, this paper uses an improved, adaptive LE (ALE) algorithm. To account for the influence of neighboring sample points and multi-scale data, the Euclidean distance from a sample point x to its k-th nearest neighbor is used as the local scaling parameter σ of x, in place of a single global σ. The effectiveness of the algorithm is verified on two public near-infrared data sets. LE-SVR and ALE-SVR models are built after LE and ALE dimensionality reduction of the SNV-preprocessed data sets. Compared with the LE-SVR model, the ALE-SVR model achieves higher R and RPD and lower RMSE on both data sets, indicating that regression models built on ALE-reduced data predict better and are more stable than those built on data reduced by the traditional LE algorithm. The experiments show that the ALE algorithm achieves a better dimensionality reduction effect than the LE algorithm.
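The local-scaling idea described above can be illustrated with a minimal NumPy sketch. The abstract does not give the weight formula, so the form w_ij = exp(-d_ij² / (σ_i σ_j)) used here is an assumption borrowed from the common self-tuning affinity; the function names, the default k = 7, the dense (fully connected) graph, and the synthetic two-scale data are likewise hypothetical and are not the authors' implementation.

```python
import numpy as np


def squared_distances(X):
    """Pairwise squared Euclidean distances between the rows of X."""
    sq = np.sum(X * X, axis=1)
    D2 = sq[:, None] + sq[None, :] - 2.0 * (X @ X.T)
    D2 = np.maximum(D2, 0.0)          # clip tiny negatives from round-off
    np.fill_diagonal(D2, 0.0)
    return D2


def adaptive_affinity(X, k=7):
    """Affinity matrix with one scale per sample (assumed self-tuning form)."""
    D2 = squared_distances(X)
    # After sorting each row, column 0 is the point itself (distance 0),
    # so column k holds the squared distance to the k-th nearest neighbour.
    sigma = np.sqrt(np.sort(D2, axis=1)[:, k])
    sigma = np.maximum(sigma, 1e-12)  # guard against duplicate samples
    # Assumption: w_ij = exp(-d_ij^2 / (sigma_i * sigma_j)).
    W = np.exp(-D2 / np.outer(sigma, sigma))
    np.fill_diagonal(W, 0.0)          # no self-loops
    return W, sigma


def laplacian_embedding(W, n_components=2):
    """Solve L y = lambda D y via the symmetric normalised Laplacian."""
    d = W.sum(axis=1)
    d_inv_sqrt = 1.0 / np.sqrt(d)
    L_sym = np.eye(len(d)) - d_inv_sqrt[:, None] * W * d_inv_sqrt[None, :]
    _, eigvecs = np.linalg.eigh(L_sym)  # eigenvalues in ascending order
    # Skip the first (trivial) eigenvector and map back with y = D^{-1/2} v.
    return d_inv_sqrt[:, None] * eigvecs[:, 1:n_components + 1]


# Hypothetical two-scale data: a tight cluster and a loose cluster.
rng = np.random.default_rng(0)
X = np.vstack([
    rng.normal(0.0, 0.1, size=(30, 5)),
    rng.normal(3.0, 1.0, size=(30, 5)),
])
W, sigma = adaptive_affinity(X, k=7)
Y = laplacian_embedding(W, n_components=2)
```

In this sketch, samples in the tight cluster receive a smaller σ than samples in the loose cluster, so the affinities within each cluster are measured on that cluster's own scale rather than on a single global one. The low-dimensional coordinates in `Y` would then be passed to a regressor such as SVR, as in the LE-SVR and ALE-SVR comparison above.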