Determination of Trace Organic Contaminant Concentration via Machine Classification of Surface-Enhanced Raman Spectra.

Affiliations

Department of Chemical and Materials Engineering, University of Alberta, Edmonton, Alberta T6G 1H9, Canada.

Department of Chemical Engineering, Kyungpook National University, Daegu 41566, Republic of Korea.

Publication information

Environ Sci Technol. 2024 Sep 3;58(35):15619-15628. doi: 10.1021/acs.est.3c06447. Epub 2024 Jan 25.

Abstract

Surface-enhanced Raman spectroscopy (SERS) has been well explored as a highly effective characterization technique that is capable of chemical pollutant detection and identification at very low concentrations. Machine learning has been previously used to identify compounds based on SERS spectral data. However, utilization of SERS to quantify concentrations, with or without machine learning, has been difficult due to the spectral intensity being sensitive to confounding factors such as the substrate parameters, orientation of the analyte, and sample preparation technique. Here, we demonstrate an approach for predicting the concentration of sample pollutants from SERS spectra using machine learning. Frequency domain transform methods, including the Fourier and Walsh-Hadamard transforms, are applied to spectral data sets of three analytes (rhodamine 6G, chlorpyrifos, and triclosan), which are then used to train machine learning algorithms. Using standard machine learning models, the concentration of the sample pollutants is predicted with >80% cross-validation accuracy from raw SERS data. A cross-validation accuracy of 85% was achieved using deep learning for a moderately sized data set (∼100 spectra), and 70-80% was achieved for small data sets (∼50 spectra). Performance can be maintained within this range even when combining various sample preparation techniques and environmental media interference. Additionally, as a spectral pretreatment, the Fourier and Hadamard transforms are shown to consistently improve prediction accuracy across multiple data sets. Finally, standard models were shown to accurately identify characteristic peaks of compounds via analysis of their importance scores, further verifying their predictive value.
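
The abstract does not give implementation details for the spectral pretreatment, so the following is only a minimal sketch of what applying the Fourier and Walsh-Hadamard transforms to a spectrum could look like before the result is passed to a classifier. The function names, the zero-padding step, the use of the Fourier magnitude, and the 8-point toy spectrum are all assumptions made for illustration, not the authors' pipeline.

```python
import cmath
import math


def pad_to_power_of_two(values):
    """Zero-pad a sequence so its length is a power of two (needed by the FWHT).

    Zero-padding is an assumption of this sketch; the paper may handle length differently.
    """
    n = 1
    while n < len(values):
        n *= 2
    return list(values) + [0.0] * (n - len(values))


def fwht(values):
    """Fast Walsh-Hadamard transform (unnormalized, natural/Hadamard ordering)."""
    out = pad_to_power_of_two(values)
    h = 1
    while h < len(out):
        # Butterfly step: combine elements that are h apart within each block of 2h.
        for start in range(0, len(out), 2 * h):
            for i in range(start, start + h):
                a, b = out[i], out[i + h]
                out[i], out[i + h] = a + b, a - b
        h *= 2
    return out


def dft_magnitude(values):
    """Magnitude of the discrete Fourier transform (naive O(n^2) version)."""
    n = len(values)
    return [
        abs(sum(v * cmath.exp(-2j * math.pi * k * t / n) for t, v in enumerate(values)))
        for k in range(n)
    ]


# Hypothetical toy "spectrum": intensities at 8 evenly spaced Raman shifts.
spectrum = [1.0, 0.0, 1.0, 0.0, 0.0, 1.0, 1.0, 0.0]

hadamard_features = fwht(spectrum)          # candidate input features for a classifier
fourier_features = dft_magnitude(spectrum)  # alternative frequency-domain features
```

In a workflow of the kind the abstract describes, each measured spectrum would be transformed this way and the resulting feature vector, labelled with its concentration class, would be used to train and cross-validate a model. For real spectra with hundreds of points, a library FFT would be a more practical choice than the naive transform shown here.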

