Kanno Nanako, Kato Shingo, Ohkuma Moriya, Matsui Motomu, Iwasaki Wataru, Shigeto Shinsuke
Department of Chemistry, School of Science, Kwansei Gakuin University, 2-1 Gakuen, Sanda, Hyogo 669-1337, Japan.
Japan Collection of Microorganisms, RIKEN BioResource Research Center, 3-1-1 Koyadai, Tsukuba, Ibaraki 305-0074, Japan.
iScience. 2021 Aug 11;24(9):102975. doi: 10.1016/j.isci.2021.102975. eCollection 2021 Sep 24.
Accessing the vast number of uncultivated microorganisms (microbial dark matter) in various Earth environments requires accurate, nondestructive classification and molecular understanding of the microorganisms in situ and at the single-cell level. Here we demonstrate a combined approach of random forest (RF) machine learning and single-cell Raman microspectroscopy for accurate classification of phylogenetically diverse prokaryotes (three bacterial and three archaeal species from different phyla). Our RF classifier achieved a classification accuracy of 98.8 ± 1.9% among the six species in pure populations and 98.4% for three species in an artificially mixed population. Feature importance scores at each wavenumber reveal that the presence of carotenoids and the structure of membrane lipids play key roles in distinguishing the prokaryotic species. We also find unique Raman markers for an ammonia-oxidizing archaeon. Our approach, with moderate data pretreatment and intuitive visualization of feature importance, is easy to use for non-spectroscopists and thus offers microbiologists a new single-cell tool for shedding light on microbial dark matter.
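The general workflow described in the abstract (train an RF classifier on single-cell spectra, then read off a feature importance score per wavenumber) can be illustrated with a minimal sketch. This is not the authors' pipeline: it assumes scikit-learn and NumPy, and the wavenumber axis, band positions, noise level, cell counts, and hyperparameters are all hypothetical values chosen for illustration, applied to synthetic spectra rather than real Raman data.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(0)

# Hypothetical wavenumber axis (cm^-1); the abstract does not state the real range.
wavenumbers = np.linspace(600.0, 1800.0, 300)


def synthetic_spectra(peak_center, n_cells):
    """Simulate noisy single-cell spectra with one Gaussian band at peak_center."""
    band = np.exp(-0.5 * ((wavenumbers - peak_center) / 10.0) ** 2)
    noise = rng.normal(0.0, 0.05, size=(n_cells, wavenumbers.size))
    return band + noise


# Three made-up "species", each given a marker band at an arbitrary position.
peak_centers = [1000.0, 1150.0, 1520.0]
n_cells = 40
X = np.vstack([synthetic_spectra(c, n_cells) for c in peak_centers])
y = np.repeat(np.arange(len(peak_centers)), n_cells)

# Estimate classification accuracy with 5-fold cross-validation.
clf = RandomForestClassifier(n_estimators=200, random_state=0)
cv_scores = cross_val_score(clf, X, y, cv=5)
mean_accuracy = cv_scores.mean()

# Refit on all spectra and obtain one importance score per wavenumber.
clf.fit(X, y)
importances = clf.feature_importances_

# Wavenumbers that contribute most to separating the classes.
top_idx = np.argsort(importances)[::-1][:5]
top_wavenumbers = wavenumbers[top_idx]

print(f"mean CV accuracy: {mean_accuracy:.3f}")
print("most important wavenumbers (cm^-1):", np.round(top_wavenumbers, 1))
```

Plotting `importances` against `wavenumbers` gives a spectrum-like curve that can be compared with known Raman band assignments, which is the kind of visualization the abstract refers to when it links important wavenumbers to carotenoids and membrane lipids.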