Interpretable machine learning models classify minerals via spectroscopy.

Authors

Smith R, Spano Tyler L, McDonnell Marshall, Drane Lance, Gibbs Ian, Miskowiec Andrew, Niedziela J L, Shields Ashley E

Affiliation

Oak Ridge National Laboratory, One Bethel Valley Road, Oak Ridge, TN, United States.

Publication information

Sci Rep. 2025 May 6;15(1):15807. doi: 10.1038/s41598-025-92686-2.

Abstract

Developing methods to identify mineral species confidently and rapidly from Raman spectral analysis is critical to numerous fields. Traditionally, analysis relies on pattern matching the Raman spectrum of an unknown dataset with a supporting library of well-characterized spectral data, which may prove difficult for environmental samples that are poorly crystalline or phase mixtures. Here, we developed interpretable machine learning (ML) models that can classify uranium minerals by secondary oxyanion chemistry and other physicochemical properties based solely on Raman spectra. This new ML method produces a mineral profile of physical and chemical properties for an unknown sample and can rapidly classify or identify unknown minerals from Raman data, without the need for an exact pattern match in a spectral library. Trained models are validated by (1) strong correlation of high-confidence model regions with published spectroscopic assignments and (2) correct classification of a mineral not present in the training data. Training data are from the Compendium of Uranium Raman and Infrared Experimental Spectra and available crystallographic information files within the open-source Smart Spectral Matching scientific framework. Physically meaningful classifier models can rapidly identify key structural and chemical information about unknown uranium minerals, and the overall methodology is broadly applicable to mineral phases.
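
The idea of an interpretable spectral classifier can be illustrated with a toy sketch. This is not the authors' model or data: it assumes synthetic spectra, a simple nearest-centroid linear classifier swapped in for whatever models the paper uses, and illustrative band positions (a uranyl-like band near 830 cm^-1 and oxyanion-like bands near 1000 and 1090 cm^-1). All function and variable names are hypothetical. The point is only that a per-wavenumber weight vector lets one check which spectral regions drive a classification, and that held-out spectra can be classified without a library match.

```python
import math
import random

# Hypothetical wavenumber grid in cm^-1 (assumption, not from the paper).
GRID = list(range(700, 1201, 5))

def gaussian_peak(center, width, height):
    """Return one Gaussian band sampled on GRID."""
    return [height * math.exp(-((x - center) ** 2) / (2 * width ** 2)) for x in GRID]

def synth_spectrum(anion_center, rng):
    """Toy spectrum: uranyl-like band plus one oxyanion-like band, with jitter and noise."""
    uranyl = gaussian_peak(830 + rng.uniform(-10, 10), 12, 1.0)
    anion = gaussian_peak(anion_center + rng.uniform(-8, 8), 10, 0.6)
    return [u + a + rng.gauss(0, 0.02) for u, a in zip(uranyl, anion)]

def mean_spectrum(spectra):
    """Average a list of spectra point by point."""
    n = len(spectra)
    return [sum(col) / n for col in zip(*spectra)]

def fit_centroid_classifier(class_a, class_b):
    """Linear classifier whose weights are the difference of the class means."""
    mu_a, mu_b = mean_spectrum(class_a), mean_spectrum(class_b)
    weights = [a - b for a, b in zip(mu_a, mu_b)]
    midpoint = [(a + b) / 2 for a, b in zip(mu_a, mu_b)]
    bias = -sum(w * m for w, m in zip(weights, midpoint))
    return weights, bias

def score(weights, bias, spectrum):
    """Positive score means class A, negative means class B."""
    return sum(w * x for w, x in zip(weights, spectrum)) + bias

rng = random.Random(0)
CLASS_A_BAND, CLASS_B_BAND = 1000, 1090  # illustrative band positions only

train_a = [synth_spectrum(CLASS_A_BAND, rng) for _ in range(30)]
train_b = [synth_spectrum(CLASS_B_BAND, rng) for _ in range(30)]
weights, bias = fit_centroid_classifier(train_a, train_b)

# Interpretation step: grid points with the largest absolute weight.
top = sorted(range(len(GRID)), key=lambda i: abs(weights[i]), reverse=True)[:5]
top_regions = sorted(GRID[i] for i in top)

# Held-out spectra that were not used for fitting.
test_set = [(synth_spectrum(CLASS_A_BAND, rng), "A") for _ in range(20)]
test_set += [(synth_spectrum(CLASS_B_BAND, rng), "B") for _ in range(20)]
predictions = ["A" if score(weights, bias, s) > 0 else "B" for s, _ in test_set]
accuracy = sum(p == label for p, (_, label) in zip(predictions, test_set)) / len(test_set)

print("most influential wavenumbers (cm^-1):", top_regions)
print("held-out accuracy:", accuracy)
```

In this sketch the shared uranyl-like band contributes little to the weights because it appears in both classes, so the influential regions fall on the bands that differ between classes. That loosely mirrors the validation idea in the abstract, where high-confidence model regions are compared with published spectroscopic assignments.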

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/d788/12056053/c53090b8b1e6/41598_2025_92686_Fig1_HTML.jpg