Zhang Lingzhi, Dai Haomin, Zhang Jialin, Zheng Zhiqiang, Song Bo, Chen Jiaya, Lin Gang, Chen Linhai, Sun Weijiang, Huang Yan
College of Horticulture, Fujian Agriculture and Forestry University, Fuzhou 350002, China.
LiuMiao White Tea Corporation, Fuding 355200, China.
Foods. 2023 Jan 21;12(3):499. doi: 10.3390/foods12030499.
Identifying the geographical origin of white tea matters because its quality and price vary considerably among production areas with different growing environments and climatic conditions. In this study, we applied near-infrared spectroscopy (NIRS) to white tea samples (n = 579) to build models that discriminate origins under different conditions. Continuous wavelet transform (CWT), min-max normalization (Minmax), multiplicative scatter correction (MSC) and standard normal variate (SNV) were used to preprocess the original spectra (OS). Principal component analysis (PCA), linear discriminant analysis (LDA) and the successive projections algorithm (SPA) were used for feature extraction. Identification models were then established for white tea from different provinces of China (DPC), for white tea from different districts of Fujian Province (DDFP) and for the authenticity of Fuding white tea (AFWT), using K-nearest neighbors (KNN), random forest (RF) and support vector machine (SVM) algorithms. Among the established models, DPC-CWT-LDA-KNN, DDFP-OS-LDA-KNN and AFWT-OS-LDA-KNN performed best, with recognition accuracies of 88.97%, 93.88% and 97.96%, respectively, and area under the curve (AUC) values of 0.85, 0.93 and 0.98, respectively. The results indicate that NIRS combined with machine learning algorithms can be an effective tool for tracing the geographical origin of white tea.
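To make the preprocess-then-classify idea concrete, the following is a minimal sketch in plain Python, not the authors' implementation. It shows two of the named preprocessing steps (SNV and min-max normalization) followed by a KNN vote. The feature-extraction stage (PCA, LDA or SPA) is deliberately omitted, and the five-point "spectra", the labels `origin_A` and `origin_B`, and all function names are hypothetical and chosen only for illustration.

```python
import math
import statistics
from collections import Counter


def snv(spectrum):
    """Standard normal variate: center and scale one spectrum by its own
    mean and (sample) standard deviation."""
    mean = statistics.fmean(spectrum)
    std = statistics.stdev(spectrum)
    return [(v - mean) / std for v in spectrum]


def minmax(spectrum):
    """Min-max normalization: rescale one spectrum to the range [0, 1]."""
    lo, hi = min(spectrum), max(spectrum)
    return [(v - lo) / (hi - lo) for v in spectrum]


def knn_predict(train_X, train_y, x, k=3):
    """Majority vote among the k training spectra closest to x (Euclidean)."""
    dists = sorted((math.dist(row, x), label) for row, label in zip(train_X, train_y))
    votes = Counter(label for _, label in dists[:k])
    return votes.most_common(1)[0][0]


def scaled(base, gain, offset):
    """Simulate a scatter-like distortion: multiplicative gain plus baseline offset."""
    return [gain * v + offset for v in base]


# Hypothetical toy "spectra": two band shapes standing in for two origins.
base_a = [1.0, 2.0, 3.0, 2.0, 1.0]
base_b = [3.0, 2.0, 1.0, 2.0, 3.0]

raw_train = [
    scaled(base_a, 1.0, 0.0),
    scaled(base_a, 2.0, 0.5),
    scaled(base_b, 1.0, 0.0),
    scaled(base_b, 1.5, 1.0),
]
train_y = ["origin_A", "origin_A", "origin_B", "origin_B"]

# Preprocess every training spectrum with SNV before classification.
train_X = [snv(s) for s in raw_train]

# A new sample with the shape of origin A but a different gain and offset.
query = scaled(base_a, 3.0, 2.0)
prediction = knn_predict(train_X, train_y, snv(query), k=3)
print(prediction)
```

Because SNV removes a positive gain and a constant offset from each spectrum, the distorted copies of one band shape collapse onto the same vector, which is why the nearest-neighbor vote can separate the two toy classes. Real NIR spectra contain hundreds of wavelengths and far less tidy differences, so a dimensionality-reduction step such as LDA would normally sit between preprocessing and the classifier.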