Visualization of a Machine Learning Framework toward Highly Sensitive Qualitative Analysis by SERS.

Affiliations

State Key Laboratory for Physical Chemistry of Solid Surfaces, College of Chemistry and Chemical Engineering, Xiamen University, Xiamen, Fujian 361005, China.

State Key Laboratory of Marine Environmental Science, Fujian Provincial Key Laboratory for Coastal Ecology and Environmental Studies, Center for Marine Environmental Chemistry & Toxicology, College of the Environment and Ecology, Xiamen University, Xiamen 361102, China.

Publication information

Anal Chem. 2022 Jul 19;94(28):10151-10158. doi: 10.1021/acs.analchem.2c01450. Epub 2022 Jul 6.

Abstract

Surface-enhanced Raman spectroscopy (SERS), providing near-single-molecule-level fingerprint information, is a powerful tool for the trace analysis of a target in a complicated matrix and is especially facilitated by the development of modern machine learning algorithms. However, both the high demand for large data sets and the low interpretability of the black-box operation significantly limit the application of well-trained models to real systems. To address these two issues, we constructed a novel machine learning algorithm-based framework (Vis-CAD) that integrates a visual random forest, a characteristic amplifier, and data augmentation. The introduction of data augmentation significantly reduced the amount of data required, and the visualization of the random forest clearly presented the captured features, from which one can judge the reliability of the algorithm. Taking the trace analysis of individual polycyclic aromatic hydrocarbons in a mixture as an example, a trustworthy accuracy of no less than 99% was achieved under optimized conditions. The visualization of the algorithm framework distinctly demonstrated that the captured features correlated well with the characteristic Raman peaks of each individual compound. Furthermore, the sensitivity toward the trace individual compound could be improved by at least 1 order of magnitude compared with inspection by the naked eye. The proposed algorithm, distinguished by its lower demand for data and the visualization of its operation process, offers a new way for the indestructible application of machine learning algorithms, which would bring push-to-the-limit sensitivity to the qualitative and quantitative analysis of trace targets, not only in the field of SERS but also in the much wider spectroscopy world. It is implemented in the Python programming language and is open-source at https://github.com/3331822w/Vis-CAD.
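The two ideas the abstract leans on, data augmentation to stretch a small training set and inspection of which spectral channels a tree ensemble relies on, can be illustrated with a minimal sketch. This is not the Vis-CAD code (see the repository linked above for that), and it makes several assumptions that do not come from the paper: the spectra are synthetic Gaussian peaks, the augmentations are a random channel shift, intensity scaling, and additive noise, and a bootstrap ensemble of depth-1 trees stands in for a full random forest. The characteristic amplifier is left out because the abstract does not describe how it works. All function names and parameter values are hypothetical.

```python
import math
import random


def spectrum(axis, peaks):
    """Sum of Gaussian peaks; each peak is (center, width, height)."""
    return [sum(h * math.exp(-0.5 * ((x - c) / w) ** 2) for c, w, h in peaks)
            for x in axis]


def augment(spec, rng, max_shift=2, scale=(0.8, 1.2), noise_sd=0.02):
    """One augmented copy: random channel shift, intensity scaling, noise."""
    n = len(spec)
    shift = rng.randint(-max_shift, max_shift)
    gain = rng.uniform(*scale)
    # Edge channels are clamped so the shifted index stays inside the spectrum.
    return [gain * spec[min(max(i - shift, 0), n - 1)] + rng.gauss(0.0, noise_sd)
            for i in range(n)]


def gini(pos, n):
    """Gini impurity of a binary node with `pos` positives out of `n`."""
    p = pos / n
    return 2.0 * p * (1.0 - p)


def best_split_gain(values, labels):
    """Largest Gini impurity decrease over all thresholds on one channel."""
    pairs = sorted(zip(values, labels))
    n, total_pos = len(pairs), sum(labels)
    parent = gini(total_pos, n)
    best, left_pos = 0.0, 0
    for k in range(1, n):
        left_pos += pairs[k - 1][1]
        if pairs[k][0] == pairs[k - 1][0]:
            continue  # no threshold fits between equal values
        children = (k * gini(left_pos, k)
                    + (n - k) * gini(total_pos - left_pos, n - k)) / n
        best = max(best, parent - children)
    return best


def channel_importance(X, y, rng, n_stumps=200, n_candidates=8):
    """Normalized impurity decrease per channel from bootstrapped depth-1 trees."""
    n_channels = len(X[0])
    importance = [0.0] * n_channels
    for _ in range(n_stumps):
        rows = [rng.randrange(len(X)) for _ in range(len(X))]  # bootstrap sample
        yb = [y[r] for r in rows]
        best_gain, best_ch = 0.0, None
        # Each stump only sees a random subset of channels, as in a random forest.
        for ch in rng.sample(range(n_channels), n_candidates):
            g = best_split_gain([X[r][ch] for r in rows], yb)
            if g > best_gain:
                best_gain, best_ch = g, ch
        if best_ch is not None:
            importance[best_ch] += best_gain
    total = sum(importance) or 1.0
    return [v / total for v in importance]


rng = random.Random(0)
axis = list(range(60))                                      # hypothetical channel index
background = spectrum(axis, [(15, 3, 1.0)])                 # matrix peak only
with_target = spectrum(axis, [(15, 3, 1.0), (40, 3, 0.5)])  # matrix + weaker target peak

# Two measured "template" spectra are expanded into 80 training samples.
X, y = [], []
for _ in range(40):
    X.append(augment(background, rng)); y.append(0)
    X.append(augment(with_target, rng)); y.append(1)

importance = channel_importance(X, y, rng)
top_channel = max(range(len(axis)), key=importance.__getitem__)
print("most informative channel:", top_channel)
```

Plotting `importance` against `axis` gives the kind of check the abstract describes: if the ensemble is trustworthy, the importance should concentrate around the target's characteristic peak (channel 40 in this toy setup) rather than around the shared matrix peak or in noise-only regions.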
