Kyathanahally Sreenath P, Hardeman Thomas, Merz Ewa, Bulas Thea, Reyes Marta, Isles Peter, Pomati Francesco, Baity-Jesi Marco
Eawag, Dübendorf, Switzerland.
Front Microbiol. 2021 Nov 15;12:746297. doi: 10.3389/fmicb.2021.746297. eCollection 2021.
Plankton are effective indicators of environmental change and ecosystem health in freshwater habitats, but collecting plankton data with manual microscopy is extremely labor-intensive and expensive. Automated plankton imaging offers a promising way to monitor plankton communities with high frequency and accuracy in real time. Yet manual annotation of millions of images poses a serious challenge to taxonomists. Deep learning classifiers have been applied successfully in various fields and have given encouraging results when used to categorize marine plankton images. Here, we present a set of deep learning models developed for the identification of lake plankton, and study several strategies for obtaining optimal performance, which lead to operational prescriptions for users. To this end, we annotated over 17,900 images of zooplankton and large phytoplankton colonies, detected in Lake Greifensee (Switzerland) with the Dual Scripps Plankton Camera, into 35 classes. Our best models were based on transfer learning and ensembling, and classified plankton images with 98% accuracy and 93% F1 score. When tested on freely available plankton datasets produced by other automated imaging tools (ZooScan, Imaging FlowCytobot, and ISIIS), our models performed better than previously used models. Our annotated data, code, and classification models are freely available online.
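As an illustration of the ensembling and the two reported metrics, the following is a minimal sketch, not the authors' implementation. It assumes that ensembling means averaging the class probabilities of several models and that the F1 score is macro-averaged over classes; the abstract specifies neither. The function names, the two models, and the toy data are hypothetical. The sketch shows how accuracy can exceed F1 when a rare class is missed.

```python
import numpy as np

def ensemble_predict(prob_list):
    """Average per-model class probabilities; return the argmax class per image.

    Assumption: simple probability averaging (the abstract does not say
    which ensembling rule was used).
    """
    mean_probs = np.mean(np.stack(prob_list, axis=0), axis=0)
    return mean_probs.argmax(axis=1)

def accuracy(y_true, y_pred):
    """Fraction of images whose predicted class matches the label."""
    return float(np.mean(y_true == y_pred))

def macro_f1(y_true, y_pred, n_classes):
    """Unweighted mean of per-class F1 = 2TP / (2TP + FP + FN)."""
    scores = []
    for c in range(n_classes):
        tp = np.sum((y_pred == c) & (y_true == c))
        fp = np.sum((y_pred == c) & (y_true != c))
        fn = np.sum((y_pred != c) & (y_true == c))
        denom = 2 * tp + fp + fn
        scores.append(2 * tp / denom if denom > 0 else 0.0)
    return float(np.mean(scores))

# Toy data (hypothetical): 6 images, 3 classes, class 2 is rare.
y_true = np.array([0, 0, 0, 0, 1, 2])

# Softmax-like outputs of two hypothetical models, shape (images, classes).
probs_a = np.array([
    [0.8, 0.1, 0.1],
    [0.7, 0.2, 0.1],
    [0.6, 0.3, 0.1],
    [0.4, 0.5, 0.1],   # model A alone misclassifies this image
    [0.2, 0.7, 0.1],
    [0.6, 0.1, 0.3],   # both models miss the rare class
])
probs_b = np.array([
    [0.9, 0.05, 0.05],
    [0.6, 0.3, 0.1],
    [0.5, 0.4, 0.1],
    [0.7, 0.2, 0.1],
    [0.3, 0.6, 0.1],
    [0.5, 0.1, 0.4],
])

ensemble_pred = ensemble_predict([probs_a, probs_b])
ensemble_acc = accuracy(y_true, ensemble_pred)
ensemble_f1 = macro_f1(y_true, ensemble_pred, n_classes=3)
single_acc = accuracy(y_true, probs_a.argmax(axis=1))

print("model A accuracy:", round(single_acc, 3))
print("ensemble accuracy:", round(ensemble_acc, 3))
print("ensemble macro F1:", round(ensemble_f1, 3))
```

In this toy case, averaging corrects one of model A's errors, but the single image of the rare class is still misclassified. That one miss barely lowers accuracy while pulling the macro-averaged F1 down sharply, which is one plausible reason a classifier over many unevenly populated classes can report a lower F1 score than accuracy.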