Infrared Spectral Analysis for Prediction of Functional Groups Based on Feature-Aggregated Deep Learning.

Affiliations

The State Key Laboratory of Chemical Oncogenomics, Shenzhen International Graduate School, Tsinghua University, Shenzhen 518055, P. R. China.

Open FIESTA, Shenzhen International Graduate School, Tsinghua University, Shenzhen 518055, P. R. China.

Publication information

J Chem Inf Model. 2023 Aug 14;63(15):4615-4622. doi: 10.1021/acs.jcim.3c00749. Epub 2023 Aug 2.

Abstract

Infrared (IR) spectroscopy is a powerful and versatile tool for analyzing functional groups in organic compounds. Interpreting large numbers of unknown spectra is complex and time-consuming, and usually requires expert knowledge of chemistry and spectroscopy. This paper presents a new deep learning method that transforms IR spectral features into intuitive image-like feature maps and predicts major functional groups. We obtained 8272 gas-phase IR spectra from the NIST Chemistry WebBook. Feature maps are constructed using the intrinsic correlation of the spectral data, and prediction models are developed based on convolutional neural networks. Twenty-one major functional groups for each molecule are successfully identified using binary and multilabel models, without expert guidance or feature selection. The multilabel classification model produces all prediction results simultaneously, enabling rapid characterization. Further analysis of detailed substructures indicates that the model can extract abundant structural information from IR spectra for a comprehensive investigation. Interpretation of the model reveals that the peaks it weights most heavily are similar to those typically considered by spectroscopists. In addition to demonstrating great potential for spectral identification, the method may contribute to the development of automated analyses in many fields.
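The abstract does not spell out how the feature maps are built or how the multilabel outputs are read, so the following is only a minimal sketch of the general idea under stated assumptions. It orders spectral bins by the leading eigenvector of their Pearson correlation matrix (an illustrative stand-in, not the authors' procedure), reshapes each reordered spectrum into a square grid that a convolutional network could consume, and thresholds per-group sigmoid scores to obtain multilabel predictions. The function names, the zero padding, and the 0.5 threshold are all hypothetical.

```python
import numpy as np


def correlation_order(spectra: np.ndarray) -> np.ndarray:
    """Return a permutation of spectral bins that tends to group correlated bins.

    Assumption: bins are sorted by their loading on the leading eigenvector
    of the bin-by-bin Pearson correlation matrix. This is a stand-in for
    whatever correlation-based layout the paper actually uses.
    """
    corr = np.corrcoef(spectra, rowvar=False)  # shape (n_bins, n_bins)
    corr = np.nan_to_num(corr)                 # constant bins produce NaN entries
    _, eigvecs = np.linalg.eigh(corr)          # eigenvalues in ascending order
    return np.argsort(eigvecs[:, -1], kind="stable")


def to_feature_maps(spectra: np.ndarray, order: np.ndarray, side: int) -> np.ndarray:
    """Reorder bins and reshape each spectrum into a (side, side) grid.

    Unused grid cells are zero-padded (an assumption made for this sketch).
    """
    n_spectra, n_bins = spectra.shape
    if n_bins > side * side:
        raise ValueError("grid is too small for the number of spectral bins")
    padded = np.zeros((n_spectra, side * side), dtype=spectra.dtype)
    padded[:, :n_bins] = spectra[:, order]
    return padded.reshape(n_spectra, side, side)


def decode_multilabel(logits: np.ndarray, threshold: float = 0.5) -> np.ndarray:
    """Turn raw per-group scores into independent yes/no predictions.

    Each functional group gets its own sigmoid, so one molecule can be
    assigned several groups at once.
    """
    probs = 1.0 / (1.0 + np.exp(-logits))
    return probs >= threshold


# Synthetic data only: 50 random "spectra" with 30 bins each.
rng = np.random.default_rng(0)
demo_spectra = rng.random((50, 30))
demo_order = correlation_order(demo_spectra)
demo_maps = to_feature_maps(demo_spectra, demo_order, side=6)
```

In a real pipeline the bin ordering would be computed once on the training spectra and reused for every new spectrum, and the logits would come from a trained network with one output per functional group (21 in the study).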
