Ning Hanyang, Ma Miao, Shi Zhiwei, Ding Liping
The School of Computer Science, Shaanxi Normal University, Xi'an, 710119, Shaanxi, China.
Institute of New Concept Sensors and Molecular Materials, Shaanxi Normal University, Xi'an, 710119, Shaanxi, China.
J Mol Model. 2025 Jul 26;31(8):218. doi: 10.1007/s00894-025-06440-6.
The unregulated use of anionic surfactants poses significant environmental risks, so methods for identifying them rapidly and accurately are needed. Fluorescence spectroscopy is a powerful tool for this task, but existing analytical strategies face a critical limitation: they either rely on complex, costly sensor arrays to acquire rich data, or apply traditional machine learning to simpler single-spectrum data, which often requires pre-processing steps such as PCA that risk information loss. Standard deep learning approaches are also often unsuitable, because the large training datasets they need are costly and laborious to acquire. To address this gap, we propose an end-to-end few-shot learning method (CNN-PN) for classifying the fluorescence emission spectra of anionic surfactants. The approach uses a one-dimensional convolutional neural network (1D-CNN) to extract features automatically from the full, raw spectrum, thus avoiding lossy pre-processing. It then uses a prototypical network to perform robust, similarity-based classification, a strategy well suited to limited sample sizes. We validated the method on our FESS dataset (53 surfactant categories) and on a public metal oxides dataset. In our experiments, CNN-PN consistently outperformed traditional techniques such as LDA, SVM, and KNN. It achieved 76.36% accuracy when trained with only a single sample per class, 95.90% in a multi-sample scenario on our FESS dataset, and 84.86% on the public dataset. This work provides a data-efficient framework for spectral analysis that can support more accessible and rapid fluorescence sensing technologies, particularly in applications where data collection is expensive or constrained.
A few-shot learning classification method based on prototypical networks was employed. A one-dimensional convolutional neural network (1D-CNN) was utilized to extract spectral features from the full fluorescence emission spectra. Classification was then performed within the prototypical network framework using Euclidean distance as the similarity metric between features in the learned latent space. The Python programming language and the PyTorch library were used for all model implementations and data analysis.
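The classification step described above can be sketched in a few lines. This is a minimal illustration and not the authors' implementation: it assumes the 1D-CNN has already mapped each spectrum to an embedding vector, and it uses plain Python lists instead of PyTorch tensors so that it runs without dependencies. The function names (`compute_prototypes`, `euclidean`, `classify`), the toy embeddings, and the softmax over negative distances are all assumptions made for illustration; the abstract only states that Euclidean distance in the learned latent space is the similarity metric.

```python
import math
from collections import defaultdict


def compute_prototypes(support_embeddings, support_labels):
    """Average the support embeddings of each class into one prototype per class."""
    grouped = defaultdict(list)
    for vec, label in zip(support_embeddings, support_labels):
        grouped[label].append(vec)
    prototypes = {}
    for label, vecs in grouped.items():
        dim = len(vecs[0])
        prototypes[label] = [sum(v[i] for v in vecs) / len(vecs) for i in range(dim)]
    return prototypes


def euclidean(a, b):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def classify(query_embedding, prototypes):
    """Assign the query to the nearest prototype.

    Also returns class probabilities from a softmax over negative distances
    (an assumed scoring rule; the nearest-prototype decision does not depend on it).
    """
    distances = {label: euclidean(query_embedding, p) for label, p in prototypes.items()}
    nearest = min(distances.values())
    # Subtract the smallest distance before exponentiating for numerical stability.
    weights = {label: math.exp(-(d - nearest)) for label, d in distances.items()}
    total = sum(weights.values())
    probs = {label: w / total for label, w in weights.items()}
    predicted = min(distances, key=distances.get)
    return predicted, probs


# Toy example: hypothetical 2-D embeddings for two classes, two support samples each.
support = [[0.0, 0.0], [0.2, 0.0], [1.0, 1.0], [1.0, 0.8]]
labels = ["A", "A", "B", "B"]
prototypes = compute_prototypes(support, labels)
predicted, probs = classify([0.9, 0.9], prototypes)
print(predicted, probs)
```

With a single support sample per class, each prototype is simply that sample's embedding, which is why this scheme remains usable in the one-sample-per-class setting reported in the abstract.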