School of Geography Science, Nanjing University of Information Science and Technology, Nanjing, China.
Upland Flue-cured Tobacco Quality & Ecology Key Laboratory of China Tobacco, Guizhou Academy of Tobacco Science, Guiyang, China.
PLoS One. 2021 Mar 25;16(3):e0247028. doi: 10.1371/journal.pone.0247028. eCollection 2021.
Spectral similarity indices are used to select similar soil samples from a spectral library and thereby improve the predictive accuracy for target samples. Many similarity indices are available, and how to select the optimum index has become a critical question. Five similarity indices were evaluated on a global soil spectral library: spectral angle mapper (SAM), Euclidean distance (ED), Mahalanobis distance (MD), and SAM and ED computed in principal component space (SAM_pca and ED_pca). The accordance between spectral and compositional similarity was used to select the optimum index. The optimum index was then tested to determine whether it maintained the greatest predictive accuracy when similar samples were selected from a spectral library to predict a target sample with a partial least squares regression (PLSR) model. The evaluated physicochemical properties were soil organic carbon, pH, cation exchange capacity (CEC), and clay, silt, and sand content. Samples selected with SAM and SAM_pca were closer in composition to the target samples than samples selected with the other indices. Based on similar samples selected using these two indices, PLSR models achieved the highest predictive accuracy for all soil properties except CEC. This validates the hypothesis that the accordance between spectral and compositional similarity can help select the appropriate similarity index when selecting similar samples from a spectral library for prediction.
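The five indices named in the abstract can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the function names, the use of a pseudo-inverse for the Mahalanobis covariance, the default of five principal components, and the computation of SAM_pca on mean-centered scores are all assumptions made for illustration. The PLSR step that follows sample selection is omitted.

```python
import numpy as np

def spectral_angle(library, target):
    """SAM: angle in radians between each library row and the target."""
    norms = np.linalg.norm(library, axis=1) * np.linalg.norm(target)
    cos = (library @ target) / norms
    # Clip to guard against rounding slightly outside [-1, 1].
    return np.arccos(np.clip(cos, -1.0, 1.0))

def euclidean(library, target):
    """ED: straight-line distance from each library row to the target."""
    return np.linalg.norm(library - target, axis=1)

def mahalanobis(library, target):
    """MD: distance scaled by the library covariance (assumption:
    pseudo-inverse, since spectral bands are often highly collinear)."""
    inv_cov = np.linalg.pinv(np.cov(library, rowvar=False))
    diff = library - target
    squared = np.einsum("ij,jk,ik->i", diff, inv_cov, diff)
    return np.sqrt(np.maximum(squared, 0.0))

def pca_scores(library, target, n_components=5):
    """Project library and target onto the library's leading principal
    components (n_components=5 is an assumed, illustrative value)."""
    mean = library.mean(axis=0)
    _, _, vt = np.linalg.svd(library - mean, full_matrices=False)
    components = vt[:n_components]
    return (library - mean) @ components.T, (target - mean) @ components.T

def select_similar(distances, k):
    """Indices of the k library samples with the smallest distance."""
    return np.argsort(distances)[:k]

def sam_pca(library, target, n_components=5):
    """SAM_pca: spectral angle computed on principal component scores."""
    lib_scores, tgt_scores = pca_scores(library, target, n_components)
    return spectral_angle(lib_scores, tgt_scores)

def ed_pca(library, target, n_components=5):
    """ED_pca: Euclidean distance computed on principal component scores."""
    lib_scores, tgt_scores = pca_scores(library, target, n_components)
    return euclidean(lib_scores, tgt_scores)
```

Given a library array of shape (samples, bands) and one target spectrum, `select_similar(spectral_angle(library, target), k)` returns the rows that would feed a local calibration. One property worth noting is that SAM depends only on spectral shape, so a target that is a scaled copy of a library spectrum has an angle near zero while its Euclidean distance is not zero.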