Guo Xue, Bai Rui-Bin, Wang Hui, Li Wei-Wen, Dong Ling, Sun Jia-Hui, Zhang Xiao-Bo, Yang Jian
National Key Laboratory for Quality Ensurance and Sustainable Use of Dao-di Herbs, National Resource Center for Chinese Materia Medica, China Academy of Chinese Medical Sciences, Beijing 100700, China.
Key Laboratory of Horticultural Crop Germplasm Innovation and Utilization (Co-construction by Ministry and Province), Institute of Horticulture, Anhui Academy of Agricultural Sciences, Hefei 230001, China.
Zhongguo Zhong Yao Za Zhi. 2024 Nov;49(22):6073-6081. doi: 10.19540/j.cnki.cjcmm.20240814.101.
Gongju (Chrysanthemum morifolium) is one of the five major medicinal Chrysanthemum varieties included in the Chinese Pharmacopoeia. In recent years, its cultivation areas have changed significantly, resulting in inconsistent quality of the medicinal material. In this study, Gongju cultivated in Anhui, Yunnan, Chongqing, and other places was selected as the study material. Hyperspectral data were collected in the visible-near-infrared (VNIR) and short-wave infrared (SWIR) bands in two acquisition modes: corolla facing up (A) and flower base facing up (B). The hyperspectral data were preprocessed using five methods: multiplicative scatter correction (MSC), Savitzky-Golay smoothing (SG), first derivative (D1), second derivative (D2), and standard normal variate (SNV). Partial least squares discriminant analysis (PLSDA), random forest (RF), and support vector machine (SVM) were then used to establish origin identification models of Gongju at two geographical scales: the province level and the city-county level within Anhui Province. Prediction accuracy was used as the evaluation index to select the optimal models, and the classification performance of the models was evaluated with confusion matrices. The results showed that acquisition mode B (flower base facing up) combined with second derivative preprocessing and the RF method gave the best model at both geographical scales. Modeling on the full band (VNIR + SWIR) was slightly better than modeling on either single band, with prediction-set accuracies of 99.69% at the province scale and 99.40% at the city-county scale. The competitive adaptive reweighted sampling algorithm (CARS), successive projections algorithm (SPA), and variable iterative space shrinkage approach (VISSA) were further used to select feature wavelengths for modeling. CARS selected relatively few feature wavelengths, and after this optimization the prediction-set accuracies of the models at the two geographical scales reached 99.56% and 98.65%, which was essentially comparable to the full-band model, while the removal of redundant variables greatly reduced model complexity. The hyperspectral technology combined with the chemometric models established in this study can identify the origin of Gongju at different geographical scales, providing a theoretical basis and technical reference for building a rapid detection system for Gongju origin and for developing dedicated miniaturized instruments and equipment.
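Two of the preprocessing steps named in the abstract (SNV and the second derivative) and the accuracy-from-confusion-matrix evaluation can be sketched in plain Python. This is a minimal illustration and not the authors' implementation: the function names, the central finite-difference form of the second derivative, the toy spectrum, and the toy labels are all assumptions made for the example, and no classifier is trained here.

```python
from statistics import mean, stdev


def snv(spectrum):
    """Standard normal variate: center and scale one spectrum by its own mean and std."""
    m = mean(spectrum)
    s = stdev(spectrum)  # sample standard deviation (n - 1); an assumed convention
    return [(x - m) / s for x in spectrum]


def second_derivative(spectrum):
    """Central finite-difference second derivative, assuming evenly spaced bands.

    The result is two points shorter than the input. The abstract does not say
    how the derivative was computed; this form is an assumption.
    """
    return [spectrum[i - 1] - 2 * spectrum[i] + spectrum[i + 1]
            for i in range(1, len(spectrum) - 1)]


def confusion_matrix(y_true, y_pred, labels):
    """Count predictions; rows are true classes, columns are predicted classes."""
    index = {label: i for i, label in enumerate(labels)}
    matrix = [[0] * len(labels) for _ in labels]
    for t, p in zip(y_true, y_pred):
        matrix[index[t]][index[p]] += 1
    return matrix


def accuracy(matrix):
    """Fraction of samples that lie on the diagonal of the confusion matrix."""
    total = sum(sum(row) for row in matrix)
    correct = sum(matrix[i][i] for i in range(len(matrix)))
    return correct / total


# Toy reflectance values, invented for illustration only.
toy_spectrum = [0.10, 0.20, 0.40, 0.70, 1.10]
scaled = snv(toy_spectrum)
d2 = second_derivative(toy_spectrum)

# Toy origin labels and predictions, invented for illustration only.
labels = ["Anhui", "Yunnan", "Chongqing"]
y_true = ["Anhui", "Anhui", "Yunnan", "Chongqing"]
y_pred = ["Anhui", "Yunnan", "Yunnan", "Chongqing"]
cm = confusion_matrix(y_true, y_pred, labels)
acc = accuracy(cm)
```

In a real workflow these steps would typically be applied to every spectrum before fitting a classifier such as a random forest, and the confusion matrix would be computed on a held-out prediction set.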