Tea Research Institute, Guizhou Academy of Agricultural Sciences, Guiyang 550025, China.
Sensors (Basel). 2024 May 24;24(11):3362. doi: 10.3390/s24113362.
As a non-destructive, fast, and cost-effective technique, near-infrared (NIR) spectroscopy has been widely used to determine the content of bioactive components in tea. However, because the various catechins in black tea have similar chemical structures, the NIR spectra of black tea overlap severely in certain bands, which introduces nonlinear relationships and reduces analytical accuracy. In addition, the number of NIR wavelengths far exceeds the number of samples available for modeling, which is a typical small-sample learning problem. These issues make it challenging to determine black tea catechins simultaneously by NIR spectroscopy (NIRS). To address them, this study proposes a wavelength selection algorithm based on feature interval combination sensitivity segmentation (FIC-SS). The algorithm extracts wavelengths at both coarse-grained and fine-grained levels, achieving higher accuracy and stability in feature wavelength extraction. On this basis, the study built four simultaneous prediction models for catechins based on extreme learning machines (ELMs), exploiting their strong nonlinear learning ability and simple model structure to predict the catechins simultaneously and accurately. The experimental results showed that, on the full spectrum, the ELM model predicts epicatechin (EC), epicatechin gallate (ECG), epigallocatechin (EGC), and epigallocatechin gallate (EGCG) better than the partial least squares model. On the feature wavelengths, the proposed FIC-SS-ELM model outperforms ELM models based on other wavelength selection algorithms; it can simultaneously and accurately predict the content of EC (Rp² = 0.91, RMSEP = 0.019), ECG (Rp² = 0.96, RMSEP = 0.11), EGC (Rp² = 0.97, RMSEP = 0.15), and EGCG (Rp² = 0.97, RMSEP = 0.35) in black tea. The results of this study provide a new method for the quantitative determination of the bioactive components of black tea.
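For readers unfamiliar with extreme learning machines, the following is a minimal sketch of a generic multi-output ELM regressor together with the two reported metrics (RMSEP and the prediction-set coefficient of determination). It is not the authors' implementation: the class name `ELMRegressor`, the `tanh` activation, the small ridge term, the hidden-layer size, and the synthetic data are all assumptions made for illustration, and the FIC-SS wavelength selection step is not reproduced because the abstract does not specify it.

```python
import numpy as np


class ELMRegressor:
    """Generic single-hidden-layer ELM (illustrative, not the paper's code).

    Hidden weights are drawn at random and never trained; only the output
    weights are fitted, in closed form, by regularized least squares.
    """

    def __init__(self, n_hidden=50, ridge=1e-3, seed=0):
        self.n_hidden = n_hidden
        self.ridge = ridge  # assumed small ridge term for numerical stability
        self.rng = np.random.default_rng(seed)

    def _hidden(self, X):
        # Nonlinear random feature map (tanh is an assumed activation).
        return np.tanh(X @ self.W + self.b)

    def fit(self, X, Y):
        n_features = X.shape[1]
        # Scale the random weights so tanh does not saturate on wide inputs.
        self.W = self.rng.normal(size=(n_features, self.n_hidden)) / np.sqrt(n_features)
        self.b = self.rng.normal(size=self.n_hidden)
        H = self._hidden(X)
        A = H.T @ H + self.ridge * np.eye(self.n_hidden)
        # One solve fits all output columns (e.g. four catechins) at once.
        self.beta = np.linalg.solve(A, H.T @ Y)
        return self

    def predict(self, X):
        return self._hidden(X) @ self.beta


def rmsep(y_true, y_pred):
    """Root mean square error of prediction, one value per output column."""
    return np.sqrt(np.mean((y_true - y_pred) ** 2, axis=0))


def r2(y_true, y_pred):
    """Coefficient of determination, one value per output column."""
    ss_res = np.sum((y_true - y_pred) ** 2, axis=0)
    ss_tot = np.sum((y_true - y_true.mean(axis=0)) ** 2, axis=0)
    return 1.0 - ss_res / ss_tot


# Synthetic stand-in data (NOT tea spectra): 60 samples, 200 "wavelengths",
# 4 targets standing in for EC, ECG, EGC, and EGCG.
rng = np.random.default_rng(42)
X_demo = rng.normal(size=(60, 200))
Y_demo = rng.normal(size=(60, 4))

X_train, X_test = X_demo[:45], X_demo[45:]
Y_train, Y_test = Y_demo[:45], Y_demo[45:]

model = ELMRegressor(n_hidden=20).fit(X_train, Y_train)
pred = model.predict(X_test)

print("RMSEP per target:", rmsep(Y_test, pred))
print("R2 per target:", r2(Y_test, pred))
```

Because the targets here are random noise, the printed scores carry no meaning; the sketch only shows how a single closed-form solve yields predictions for several targets at once and how the per-target metrics are computed.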