State Key Laboratory of Precision Measuring Technology and Instruments, Tianjin University, Tianjin 300072, PR China; School of Precision Instruments and Optoelectronics Engineering, Tianjin University, Tianjin 300072, PR China.
Spectrochim Acta A Mol Biomol Spectrosc. 2021 May 5;252:119475. doi: 10.1016/j.saa.2021.119475. Epub 2021 Jan 16.
High-oil corn is a high-quality corn variety with a higher oil content and greater caloric energy than normal corn. Hence, controlling the purity and authenticity of high-oil corn is of great importance in current crop research. The aim of this study is to develop a novel method for corn variety discrimination using terahertz (THz) spectroscopy and signal classification analysis. In brief, the method involves feature extraction and variable selection on raw signals from THz time-domain waveforms (THz-TDW) and absorption spectra (THz-AS), followed by the use of classifiers on the treated signals to establish the discrimination models. Principal component analysis (PCA) was used for feature extraction with THz-TDW, while three different variable selection methods were implemented with THz-AS: uninformative variable elimination (UVE), uninformative variable elimination combined with the successive projections algorithm (UVE-SPA), and competitive adaptive reweighted sampling (CARS). Then, two classification algorithms, linear discriminant analysis (LDA) and support vector machine (SVM), were employed and compared in the discrimination models. The bootstrapped Latin partitions (BLP) method with 10 bootstraps and 5 Latin partitions was applied to validate these models. Our modeling results suggest that SVM is the better classification algorithm, achieving higher identification accuracy; the PCA-SVM model for THz-TDW achieved 94.7% accuracy. The results also indicate that variable selection is an important step in creating an accurate and robust discrimination model for THz-AS. The CARS-SVM model with a radial basis function (RBF) kernel achieved 100% average accuracy on the prediction set, while the UVE-SVM and UVE-SPA-SVM models achieved 91.2% and 99.1% accuracy, respectively.
These results demonstrate that high-oil corn and normal corn can be identified successfully by using THz spectroscopy with discriminant analysis, suggesting that our techniques provide an efficient and practical reference for classifying crop varieties in agricultural research, while expanding the application of THz spectroscopy in related fields.
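The validation scheme named in the abstract (10 bootstraps, each split into 5 Latin partitions) can be sketched in plain Python as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the function names, the label names, the sample counts, and the `fit_predict` callback are all hypothetical, and pooling the predictions within each bootstrap before averaging across bootstraps is an assumption about how the average accuracy was computed.

```python
import random
from collections import defaultdict


def latin_partitions(labels, n_partitions, rng):
    """Split sample indices into n_partitions folds with similar class proportions.

    Every index lands in exactly one fold.
    """
    by_class = defaultdict(list)
    for idx, label in enumerate(labels):
        by_class[label].append(idx)
    folds = [[] for _ in range(n_partitions)]
    for label in sorted(by_class):
        members = by_class[label]
        rng.shuffle(members)
        # Deal the shuffled members of this class round-robin across the folds.
        for position, idx in enumerate(members):
            folds[position % n_partitions].append(idx)
    return folds


def bootstrapped_latin_partitions(labels, n_bootstraps=10, n_partitions=5, seed=0):
    """Yield (bootstrap, partition, train_idx, test_idx) for every split."""
    rng = random.Random(seed)
    for b in range(n_bootstraps):
        # Each bootstrap reshuffles the data into a fresh set of partitions.
        folds = latin_partitions(labels, n_partitions, rng)
        for p, test_idx in enumerate(folds):
            train_idx = [i for q, fold in enumerate(folds) if q != p for i in fold]
            yield b, p, sorted(train_idx), sorted(test_idx)


def blp_accuracy(features, labels, fit_predict,
                 n_bootstraps=10, n_partitions=5, seed=0):
    """Average accuracy over bootstraps, pooling predictions within each bootstrap.

    fit_predict(train_X, train_y, test_X) is a hypothetical callback that trains
    a classifier and returns one predicted label per row of test_X.
    """
    correct = [0] * n_bootstraps
    splits = bootstrapped_latin_partitions(labels, n_bootstraps, n_partitions, seed)
    for b, _, train_idx, test_idx in splits:
        predicted = fit_predict(
            [features[i] for i in train_idx],
            [labels[i] for i in train_idx],
            [features[i] for i in test_idx],
        )
        correct[b] += sum(p == labels[i] for p, i in zip(predicted, test_idx))
    # Within one bootstrap, each sample is predicted exactly once.
    per_bootstrap = [c / len(labels) for c in correct]
    return sum(per_bootstrap) / n_bootstraps


# Hypothetical data set: the abstract does not state the sample counts.
demo_labels = ["high_oil"] * 30 + ["normal"] * 20
demo_splits = list(bootstrapped_latin_partitions(demo_labels))
print(len(demo_splits), "train/test splits generated")
```

Any classifier can be plugged in through `fit_predict`; for instance, a PCA-plus-SVM pipeline from a library such as scikit-learn could be wrapped in a function with that signature. Because every sample is used for prediction exactly once per bootstrap, the spread of the per-bootstrap accuracies also gives a rough sense of how stable a model is.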