Department of Mechanical and Mechatronics Engineering, University of Auckland, Auckland 1010, New Zealand.
Dodd-Walls Centre for Photonic and Quantum Technologies, Dunedin 9054, New Zealand.
Anal Chem. 2022 Sep 20;94(37):12907-12918. doi: 10.1021/acs.analchem.2c03082. Epub 2022 Sep 6.
Machine learning has had a significant impact on the value of spectroscopic characterization tools, particularly in biomedical applications, due to its ability to detect latent patterns within complex spectral data. However, it often requires extensive data preprocessing, including baseline correction and denoising, which can introduce unintentional bias during classification. To address this, we developed two deep learning methods capable of fully preprocessing raw Raman spectroscopy data without any human input. First, cascaded deep convolutional neural networks (CNNs) based on either ResNet or U-Net architectures were trained on randomly generated spectra with augmented defects. Then, they were tested using simulated Raman spectra, surface-enhanced Raman spectroscopy (SERS) imaging of chemical species, low-resolution Raman spectra of human bladder cancer tissue, and finally, classification of SERS spectra from human placental extracellular vesicles (EVs). Both approaches resulted in faster training and complete spectral preprocessing in a single step, with greater speed, defect tolerance, and classification accuracy than conventional methods. These findings indicate that cascaded CNN preprocessing is ideal for biomedical Raman spectroscopy applications in which large numbers of heterogeneous spectra with diverse defects need to be automatically, rapidly, and reproducibly preprocessed.
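The training strategy described above, randomly generated spectra with augmented defects, can be illustrated with a minimal sketch. The abstract does not specify the peak shape, baseline model, or noise model, so everything here is an assumption made for illustration: Lorentzian peaks, a random cubic polynomial baseline, additive Gaussian noise, and the hypothetical names `lorentzian` and `make_training_pair`. It is not the authors' generator.

```python
import numpy as np


def lorentzian(x, center, width, height):
    """Lorentzian line shape; `width` is the half-width at half-maximum."""
    return height * width**2 / ((x - center) ** 2 + width**2)


def make_training_pair(n_points=1024, max_peaks=8, noise_std=0.02, rng=None):
    """Return (raw, clean): a defect-laden spectrum and its defect-free target.

    All distributions below are illustrative assumptions, not values
    taken from the paper.
    """
    rng = np.random.default_rng() if rng is None else rng
    x = np.linspace(0.0, 1.0, n_points)  # normalised wavenumber axis

    # Clean target: a random number of randomly placed Lorentzian peaks.
    clean = np.zeros(n_points)
    n_peaks = int(rng.integers(1, max_peaks + 1))
    for _ in range(n_peaks):
        clean += lorentzian(
            x,
            center=rng.uniform(0.05, 0.95),
            width=rng.uniform(0.002, 0.02),
            height=rng.uniform(0.1, 1.0),
        )

    # Defect 1 (assumed): smooth baseline from a random cubic polynomial,
    # shifted so that it is non-negative.
    baseline = np.polyval(rng.uniform(-1.0, 1.0, size=4), x)
    baseline -= baseline.min()

    # Defect 2 (assumed): additive Gaussian noise.
    noise = rng.normal(0.0, noise_std, size=n_points)

    raw = clean + baseline + noise
    return raw, clean


if __name__ == "__main__":
    raw, clean = make_training_pair(rng=np.random.default_rng(0))
    print(raw.shape, clean.shape)
```

Under these assumptions, a network would be trained to map `raw` to `clean`, so that baseline correction and denoising are learned jointly rather than applied as separate hand-tuned steps.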