College of Biosystems Engineering and Food Science, Zhejiang University, Hangzhou, Zhejiang, 310058, China; Key Laboratory of On Site Processing Equipment for Agricultural Products, Ministry of Agriculture and Rural Affairs, China.
College of Biosystems Engineering and Food Science, Zhejiang University, Hangzhou, Zhejiang, 310058, China; Key Laboratory of On Site Processing Equipment for Agricultural Products, Ministry of Agriculture and Rural Affairs, China; Faculty of Agricultural and Food Science, Zhejiang A&F University, Hangzhou, Zhejiang, 311300, China.
Anal Chim Acta. 2019 Jun 13;1058:48-57. doi: 10.1016/j.aca.2019.01.002. Epub 2019 Jan 8.
Learning patterns from spectra is critical for chemometric analysis of spectroscopic data. Conventional two-stage calibration approaches consist of data preprocessing followed by modeling. Misuse of preprocessing may introduce artifacts or remove useful patterns, resulting in worse model performance. DeepSpectra, an end-to-end deep learning approach incorporating an Inception module, is presented to learn patterns from raw data and improve model performance. The DeepSpectra model is compared with three other convolutional neural network (CNN) models on raw data, and 16 preprocessing approaches are included to evaluate the impact of preprocessing, using four open-access visible and near-infrared spectroscopic datasets (corn, tablets, wheat, and soil). DeepSpectra outperforms the other three CNN models on all four datasets and, in most scenarios, obtains better results on raw data than on preprocessed data. The model is also compared with the linear partial least squares (PLS) method and with the nonlinear artificial neural network (ANN) and support vector regression (SVR) methods on raw and preprocessed data. The results show that DeepSpectra gives better results than these conventional linear and nonlinear calibration approaches in most scenarios. Increasing the number of training samples can improve model repeatability and accuracy.
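To illustrate the general idea behind the Inception module named in the abstract, the following is a minimal sketch of an Inception-style block applied to a one-dimensional spectrum: several convolutions with different kernel widths run in parallel on the same input, and their outputs are stacked as channels. It is not the DeepSpectra architecture. The abstract gives no kernel sizes, filter counts, or layer layout, so the function names, the kernel widths (1, 3, 5), the fixed kernel values, and the toy spectrum are all assumptions chosen for illustration. A full Inception module typically also includes bottleneck and pooling branches, which are omitted here.

```python
from typing import List, Sequence


def conv1d_same(signal: Sequence[float], kernel: Sequence[float]) -> List[float]:
    """Cross-correlate signal with kernel, zero-padded so the output keeps the input length.

    Assumes an odd kernel length.
    """
    k = len(kernel)
    pad = k // 2
    padded = [0.0] * pad + list(signal) + [0.0] * pad
    return [
        sum(padded[i + j] * kernel[j] for j in range(k))
        for i in range(len(signal))
    ]


def relu(values: Sequence[float]) -> List[float]:
    """Rectified linear activation applied element-wise."""
    return [max(0.0, v) for v in values]


def inception_block_1d(
    spectrum: Sequence[float], branch_kernels: Sequence[Sequence[float]]
) -> List[List[float]]:
    """Apply each branch kernel to the same input in parallel and stack the results as channels."""
    return [relu(conv1d_same(spectrum, kernel)) for kernel in branch_kernels]


# Hypothetical fixed kernels; in a trained network these weights would be learned.
branch_kernels = [
    [1.0],                      # width 1: pointwise scaling
    [-1.0, 0.0, 1.0],           # width 3: local slope
    [0.2, 0.2, 0.2, 0.2, 0.2],  # width 5: local average
]

# Toy spectrum with a single peak (not real data).
spectrum = [0.0, 1.0, 2.0, 3.0, 2.0, 1.0, 0.0]

feature_maps = inception_block_1d(spectrum, branch_kernels)
for kernel, channel in zip(branch_kernels, feature_maps):
    print(len(kernel), channel)
```

Each branch sees the spectrum at a different scale, so narrow and broad spectral features can be captured in the same layer. The slope-like and averaging kernels used here resemble common derivative and smoothing filters; they are hand-picked only to make the output easy to inspect.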