Balytskyi Yaroslav, Kalashnyk Nataliia, Hubenko Inna, Balytska Alina, McNear Kelly
Department of Physics and Astronomy, Wayne State University, Detroit, Michigan 48201, United States.
National University of Civil Protection of Ukraine, Cherkasy 18034, Ukraine.
Chem Biomed Imaging. 2024 May 6;2(6):442-452. doi: 10.1021/cbmi.4c00007. eCollection 2024 Jun 24.
The combination of deep learning and Raman spectroscopy shows great potential for precise and prompt identification of pathogenic bacteria in clinical settings. However, traditional closed-set classification approaches assume that all test samples belong to one of the known pathogens. Their applicability is limited because the clinical environment is inherently unpredictable and dynamic, and unknown or emerging pathogens may not be included in the available catalogs. We demonstrate that current state-of-the-art neural networks that identify pathogens from Raman spectra are vulnerable to unknown inputs, resulting in an uncontrollable false positive rate. To address this issue, we first developed an ensemble of ResNet architectures combined with an attention mechanism that achieves a 30-isolate accuracy of 87.8 ± 0.1%. Second, by integrating feature regularization through the Objectosphere loss function, our model both achieves high accuracy in identifying known pathogens from the catalog and effectively separates unknown samples, drastically reducing the false positive rate. Finally, the proposed feature regularization during training significantly enhances the performance of out-of-distribution detectors at inference, improving the reliability of unknown-class detection. Our algorithm for Raman spectroscopy enables the identification of previously unknown, uncataloged, and emerging pathogens, ensuring adaptability to pathogens that may surface in the future. Moreover, it can be extended to open-set medical image classification, bolstering its reliability in dynamic operational settings.
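The abstract names the Objectosphere loss but does not state its form. As a minimal sketch, assuming the formulation commonly given for this loss (cross-entropy plus a penalty that pushes the feature norm of known samples above a margin, and a uniform-target entropic term plus a penalty that pulls the feature norm of unknown samples toward zero), it might look like the following in NumPy. The function name, the `-1` label convention for unknown samples, and the values of `xi` and `lam` are illustrative assumptions, not taken from the paper.

```python
import numpy as np


def log_softmax(logits):
    """Numerically stable log-softmax over the class axis."""
    shifted = logits - logits.max(axis=1, keepdims=True)
    return shifted - np.log(np.exp(shifted).sum(axis=1, keepdims=True))


def objectosphere_loss(logits, features, labels, xi=10.0, lam=1e-2):
    """Sketch of an Objectosphere-style loss (hyperparameters are assumptions).

    logits:   (N, C) class scores
    features: (N, D) deep features feeding the final layer
    labels:   (N,) int class index for known samples, -1 for unknown samples
    """
    logp = log_softmax(logits)
    norms = np.linalg.norm(features, axis=1)
    known = labels >= 0
    loss = np.empty(len(labels), dtype=float)

    # Known samples: cross-entropy plus a hinge penalty when the norm is below xi.
    idx = np.flatnonzero(known)
    hinge = np.maximum(xi - norms[idx], 0.0)
    loss[idx] = -logp[idx, labels[idx]] + lam * hinge ** 2

    # Unknown samples: push the softmax toward uniform and the feature norm to zero.
    unk = ~known
    loss[unk] = -logp[unk].mean(axis=1) + lam * norms[unk] ** 2

    return loss.mean()
```

With features regularized this way, a simple out-of-distribution score such as the feature norm or the maximum softmax probability tends to separate known from unknown inputs more cleanly; which detectors the authors evaluated is not stated in the abstract.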