Bitra Venkat Suprabath, Verma Shweta, Rao B Tirumala
International Institute of Information Technology Bangalore, Electronic City, Bengaluru, Karnataka, 560100, India.
Laser Materials Processing Division, Raja Ramanna Centre for Advanced Technology, Indore, Madhya Pradesh, 452013, India.
Anal Chim Acta. 2024 Sep 15;1322:343063. doi: 10.1016/j.aca.2024.343063. Epub 2024 Aug 5.
Emerging inexpensive, compact Internet of Things (IoT) microcontrollers that run tiny machine learning (TinyML) take ML-driven Raman spectroscopy a step further toward more affordable, highly compact, field-deployable instruments. Further, the lack of large spectral datasets and the need for numerous specialized substrates impede the development of many ML-based surface-enhanced Raman spectroscopy (SERS) applications. The aim of this work is to introduce TinyML analysis of a wide range of spectral classes using a customized dataset obtained with low-cost SERS. To this end, it is vital to establish an optimal ML model and an efficient data-handling methodology for low-memory TinyML units.
We introduce a novel TinyML methodology for accurate classification of a large number of spectral classes, with smartphone assistance for data communication and visualization of results. To generate a large customized spectral dataset, we present a facile micro-drop SERS method using Au colloids and reusable grooved Al substrates. The results demonstrate that a memory-efficient convolutional neural network based on 8-bit data quantization accurately identifies 22 different spectral classes of trace dye-pesticide mixtures and pharmaceuticals. In this quantized data analysis, the spectral classes vary widely in intensity and complexity (many individual analytes, binary mixtures, and some mixtures of varied composition), and data normalization is shown to be powerful, improving ML classification accuracy from about 55% to >99.5%. Robustness is demonstrated against inter-instrument data variations such as spectral shifts, high noise, and abscissa flip, with five-fold cross-validation of model performance. In addition, on-site quantification of an analyte through spectral intensity is demonstrated.
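One plausible reason normalization matters so much before 8-bit quantization can be illustrated with a small sketch. It is not the authors' pipeline: the abstract does not state which normalization or quantization scheme was used, so per-spectrum min-max scaling and simple linear mapping to 0..255 are assumptions, and the helper names (`min_max_normalize`, `quantize_uint8`, `gaussian_peak`) and the synthetic single-peak spectra are hypothetical.

```python
import math


def min_max_normalize(spectrum):
    """Rescale one spectrum to [0, 1] (assumed normalization scheme)."""
    lo, hi = min(spectrum), max(spectrum)
    if hi == lo:
        return [0.0 for _ in spectrum]
    return [(v - lo) / (hi - lo) for v in spectrum]


def quantize_uint8(values, full_scale):
    """Linearly map values in [0, full_scale] to integers in 0..255."""
    out = []
    for v in values:
        q = round(255 * v / full_scale)
        out.append(max(0, min(255, q)))  # clip to the unsigned 8-bit range
    return out


def gaussian_peak(n, center, width, height):
    """Synthetic single-peak 'spectrum' used only for illustration."""
    return [height * math.exp(-((i - center) / width) ** 2) for i in range(n)]


# Two synthetic spectra with the same shape but a 100x intensity difference.
strong = gaussian_peak(200, 80, 6, 50000.0)
weak = gaussian_peak(200, 80, 6, 500.0)

# (a) Quantize raw counts against one shared full scale:
#     the weak spectrum is squeezed into the lowest few integer levels.
raw_strong = quantize_uint8(strong, 50000.0)
raw_weak = quantize_uint8(weak, 50000.0)

# (b) Normalize each spectrum first, then quantize:
#     both spectra now use the full 8-bit range.
norm_strong = quantize_uint8(min_max_normalize(strong), 1.0)
norm_weak = quantize_uint8(min_max_normalize(weak), 1.0)

print("distinct levels, weak spectrum, raw:       ", len(set(raw_weak)))
print("distinct levels, weak spectrum, normalized:", len(set(norm_weak)))
```

In this toy case, the weak spectrum keeps only a handful of distinct integer levels when quantized on the shared raw scale, whereas after normalization it spans the full 0..255 range and matches the strong spectrum of the same shape. Whether this mechanism accounts for the accuracy gain reported above is not something the abstract confirms.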
This study opens up a new approach to ML analysis for realizing next-generation field-deployable analytical instruments while maintaining data privacy. It presents a detailed procedure for quantized spectral data analysis and its implementation in TinyML, which should be attractive to users and instrument manufacturers. The presented computer-free ML analysis can be employed in all types of spectrometers, meeting the common goal of Raman spectroscopy, i.e., accurate identification of complex spectral classes.