

Accurate virus identification with interpretable Raman signatures by machine learning.

Affiliations

College of Information Sciences and Technology, The Pennsylvania State University, University Park, PA 16802.

Department of Physics, The Pennsylvania State University, University Park, PA 16802.

Publication information

Proc Natl Acad Sci U S A. 2022 Jun 7;119(23):e2118836119. doi: 10.1073/pnas.2118836119. Epub 2022 Jun 2.

Abstract

Rapid identification of newly emerging or circulating viruses is an important first step toward managing the public health response to potential outbreaks. A portable virus capture device, coupled with label-free Raman spectroscopy, holds the promise of fast detection by rapidly obtaining the Raman signature of a virus followed by a machine learning (ML) approach applied to recognize the virus based on its Raman spectrum, which is used as a fingerprint. We present such an ML approach for analyzing Raman spectra of human and avian viruses. A convolutional neural network (CNN) classifier specifically designed for spectral data achieves very high accuracy for a variety of virus type or subtype identification tasks. In particular, it achieves 99% accuracy for classifying influenza virus type A versus type B, 96% accuracy for classifying four subtypes of influenza A, 95% accuracy for differentiating enveloped and nonenveloped viruses, and 99% accuracy for differentiating avian coronavirus (infectious bronchitis virus [IBV]) from other avian viruses. Furthermore, interpretation of neural net responses in the trained CNN model using a full-gradient algorithm highlights Raman spectral ranges that are most important to virus identification. By correlating ML-selected salient Raman ranges with the signature ranges of known biomolecules and chemical functional groups—for example, amide, amino acid, and carboxylic acid—we verify that our ML model effectively recognizes the Raman signatures of proteins, lipids, and other vital functional groups present in different viruses and uses a weighted combination of these signatures to identify viruses.
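The final step described in the abstract, correlating ML-selected salient Raman ranges with the signature ranges of known functional groups, can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the band table holds approximate textbook values rather than the reference ranges used in the paper, the saliency profile is toy data standing in for the output of the full-gradient interpretation, and the helper names `salient_ranges` and `match_bands` are hypothetical.

```python
from typing import Dict, List, Optional, Tuple

Range = Tuple[float, float]

# Approximate, illustrative Raman band ranges in cm^-1 (assumed for this
# sketch; not the reference table used in the paper).
REFERENCE_BANDS: Dict[str, Range] = {
    "amide I": (1600.0, 1700.0),
    "amide III": (1230.0, 1300.0),
    "phenylalanine ring": (995.0, 1010.0),
}


def salient_ranges(wavenumbers: List[float], saliency: List[float],
                   threshold: float) -> List[Range]:
    """Group consecutive points whose saliency meets the threshold into ranges."""
    ranges: List[Range] = []
    start: Optional[float] = None
    prev: Optional[float] = None
    for w, s in zip(wavenumbers, saliency):
        if s >= threshold:
            if start is None:
                start = w          # open a new salient range
            prev = w               # extend the current range
        elif start is not None:
            ranges.append((start, prev))  # close the range at the last salient point
            start = None
    if start is not None:
        ranges.append((start, prev))      # close a range that runs to the end
    return ranges


def match_bands(ranges: List[Range],
                bands: Dict[str, Range] = REFERENCE_BANDS) -> Dict[Range, List[str]]:
    """Map each salient range to the names of the reference bands it overlaps."""
    matches: Dict[Range, List[str]] = {}
    for lo, hi in ranges:
        # Two closed intervals overlap when each starts before the other ends.
        matches[(lo, hi)] = [name for name, (b_lo, b_hi) in bands.items()
                             if lo <= b_hi and b_lo <= hi]
    return matches


# Toy input: a hypothetical saliency profile, not real model output.
wavenumbers = [float(w) for w in range(900, 1800, 10)]
saliency = [1.0 if (1000 <= w <= 1020 or 1640 <= w <= 1680) else 0.1
            for w in wavenumbers]

found = salient_ranges(wavenumbers, saliency, threshold=0.5)
matched = match_bands(found)
for rng, names in matched.items():
    print(rng, names)
```

With this toy profile, the two salient ranges map to the phenylalanine ring band and the amide I band respectively. In practice the threshold and the band table would need to be chosen with care, since a single salient range can overlap several functional-group bands.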


https://cdn.ncbi.nlm.nih.gov/pmc/blobs/6c72/9191668/b277e900698c/pnas.2118836119fig01.jpg
