College of Biosystems Engineering and Food Science, Zhejiang University, Hangzhou, Zhejiang, 310058, China; Key Laboratory of on Site Processing Equipment for Agricultural Products, Ministry of Agriculture and Rural Affairs, China.
School of Geosciences and Info-Physics, Central South University, South Lushan Road, Changsha, 410000, China.
Anal Chim Acta. 2020 Jul 4;1119:41-51. doi: 10.1016/j.aca.2020.03.055. Epub 2020 Apr 8.
Deep learning approaches, especially convolutional neural network (CNN) models, have achieved excellent performance in vibrational spectral analysis. The critical drawback of the CNN approach is its lack of interpretability: the model is regarded as a black box. Interpreting the learning mechanism of chemometric models is critical for intuitive understanding and further application. In this study, an interpretable CNN model with a global average pooling layer is presented for Raman and mid-infrared spectral data analysis. A class activation mapping (CAM)-based approach is used to visualize the active variables across the whole spectrum. The visualization shows a discriminative pattern in which the variables that contribute most peak around the theoretical characteristic chemical bands. Visualizing the feature maps of the three convolutional layers reveals the data transformation pipeline and how the CNN model hierarchically extracts informative spectral features. The first layer acts as a Savitzky-Golay filter and learns spectral shape characteristics, while the second layer learns enhanced patterns from typical spectral peaks on a few correlated variables. The third layer shows stable activations on critical spectral peaks. A partial least squares-linear discriminant analysis (PLS-LDA) model is presented for comparison of classification accuracy and model interpretation. On the test set, the CNN model yields mean classification accuracies of 99.01% and 100% for the E. coli and meat datasets, respectively, while the PLS-LDA models obtain accuracies of 98.83% and 100%. Both the CNN and PLS-LDA models show stable patterns in their active variables, but the CNN models are more stable than the PLS-LDA models in classification performance across dataset partitions generated by Monte-Carlo cross-validation.
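The CAM idea mentioned in the abstract can be illustrated with a minimal NumPy sketch. It is not the authors' implementation. It assumes a 1D CNN whose last convolutional layer feeds a global average pooling (GAP) layer and then a single dense layer without bias. The array shapes, the random values, and the helper name `class_activation_map` are hypothetical and chosen only for illustration.

```python
import numpy as np


def class_activation_map(feature_maps, dense_weights, class_index):
    """CAM for a 1D CNN with global average pooling (illustrative helper).

    feature_maps  : (n_channels, n_positions) output of the last conv layer
    dense_weights : (n_channels, n_classes) weights of the layer after GAP
    Returns one activation score per position along the spectrum.
    """
    # Weight each channel by its contribution to the chosen class, then sum.
    return dense_weights[:, class_index] @ feature_maps


rng = np.random.default_rng(0)
feature_maps = rng.random((8, 200))       # assumed: 8 channels, 200 positions
dense_weights = rng.normal(size=(8, 2))   # assumed: 2 classes

# Forward pass through GAP and the dense layer (bias omitted in this sketch).
pooled = feature_maps.mean(axis=1)
logits = pooled @ dense_weights

cam = class_activation_map(feature_maps, dense_weights, class_index=1)

# Positions with the largest activation are the "active variables" for class 1.
top_positions = np.argsort(cam)[::-1][:5]
```

Because GAP is a plain average, the mean of the CAM over all positions equals the class logit, so the map splits the class score into per-position contributions. If the network reduces the spectral length (for example through pooling or strided convolutions), the map would need to be upsampled to the original wavenumber axis before comparing it with chemical bands.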