Department of Bio and Brain Engineering, Korea Advanced Institute of Science and Technology (KAIST), Daejeon 34141, Korea.
Department of Mechanical System Engineering, Kumoh National Institute of Technology, Gumi 39177, Korea.
Biosensors (Basel). 2021 Nov 30;11(12):490. doi: 10.3390/bios11120490.
Surface-Enhanced Raman Spectroscopy (SERS)-based biomolecule detection has been a challenge due to large variations in signal intensity, spectral profile, and nonlinearity. Recent advances in machine learning offer great opportunities to address these issues. However, well-documented procedures for model development and evaluation, as well as benchmark datasets, are lacking. To this end, we provide a SERS spectral benchmark dataset of Rhodamine 6G (R6G) for a molecule detection task and evaluate the classification performance of several machine learning models. We also perform a comparative study to find the best combination of preprocessing methods and machine learning models. Our best model, which we call SERSNet, robustly identifies R6G molecules with excellent independent test performance. In particular, SERSNet achieves a balanced accuracy of 95.9% on the cross-batch testing task.
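For readers unfamiliar with the metric reported above, balanced accuracy is the mean of the per-class recalls, which keeps a majority class from dominating the score. The following is a minimal sketch in plain Python, not the authors' evaluation code; the function name `balanced_accuracy` and the toy labels (1 = R6G present, 0 = absent) are assumptions made for illustration only.

```python
from collections import defaultdict


def balanced_accuracy(y_true, y_pred):
    """Return the mean of per-class recall over the classes present in y_true."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("y_true and y_pred must be non-empty and equally long")
    correct = defaultdict(int)  # correctly predicted samples per true class
    total = defaultdict(int)    # all samples per true class
    for t, p in zip(y_true, y_pred):
        total[t] += 1
        if t == p:
            correct[t] += 1
    recalls = [correct[c] / total[c] for c in total]
    return sum(recalls) / len(recalls)


# Hypothetical, imbalanced toy labels (not data from the paper):
# 8 spectra with the analyte present (1) and 2 without it (0).
y_true = [1] * 8 + [0] * 2
# All positives are classified correctly; one of the two negatives is missed.
y_pred = [1] * 8 + [1, 0]

plain_accuracy = sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)
print(plain_accuracy)                      # 9 of 10 samples correct
print(balanced_accuracy(y_true, y_pred))   # (8/8 + 1/2) / 2
```

In this toy case the plain accuracy is 0.9 while the balanced accuracy is 0.75, which shows why the latter is the more conservative choice when class sizes differ between batches.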