College of Computer and Information Engineering (College of Artificial Intelligence), Nanjing Tech University, Nanjing, China.
Ocean College, Zhejiang University, Zhoushan, China.
J Phycol. 2023 Dec;59(6):1166-1178. doi: 10.1111/jpy.13390. Epub 2023 Nov 23.
Diatoms are a crucial component in the study of aquatic ecosystems and ancient environmental records. However, traditional methods for identifying diatoms, such as morphological taxonomy and molecular detection, are costly and time-consuming and have other limitations. To address these issues, we assembled an extensive collection of diatom images, consisting of 7983 images from 160 genera and 1042 species, which we expanded to 49,843 images through preprocessing, segmentation, and data augmentation. We compared how different experimental settings, including backbone networks, batch sizes, dynamic data augmentation, and static data augmentation, affected the results. The ResNet152 network outperformed the other networks, achieving top-1 and top-5 accuracies of 85.97% and 95.26%, respectively, in identifying 1042 diatom species. Additionally, we propose a method that combines model prediction and cosine similarity to improve the model's performance on low-probability predictions, achieving an accuracy of 86.07% in diatom identification. Our research contributes significantly to the recognition and classification of diatom images and has potential applications in water quality assessment, ecological monitoring, and detecting changes in aquatic biodiversity.
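The abstract does not specify how model prediction and cosine similarity are combined, so the following is only a minimal sketch of one plausible reading. It assumes that a prediction is trusted when its top softmax probability reaches a threshold, and that otherwise the top-ranked candidate classes are re-ranked by the cosine similarity between the image's feature vector and a per-class reference vector. The names `predict_with_fallback`, `prototypes`, `threshold`, and `top_k`, and the default values, are hypothetical and do not come from the paper.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def predict_with_fallback(probs, feature, prototypes, threshold=0.5, top_k=5):
    """Return a class index for one image.

    probs      -- class probabilities from the classifier (assumed softmax output)
    feature    -- feature vector of the image (assumed to come from the backbone)
    prototypes -- one reference feature vector per class (hypothetical, e.g. class means)
    threshold  -- assumed cut-off below which a prediction counts as low-probability
    top_k      -- number of highest-probability classes to re-rank
    """
    # Rank class indices from most to least probable.
    ranked = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
    best = ranked[0]
    if probs[best] >= threshold:
        # Confident prediction: keep the classifier's answer.
        return best
    # Low-probability prediction: pick the candidate whose reference
    # vector is most similar to the image's feature vector.
    candidates = ranked[:top_k]
    return max(candidates, key=lambda i: cosine_similarity(feature, prototypes[i]))


# Toy example with three classes and two-dimensional features.
prototypes = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
confident = predict_with_fallback([0.90, 0.05, 0.05], [0.0, 1.0], prototypes)
uncertain = predict_with_fallback([0.40, 0.35, 0.25], [0.1, 0.9], prototypes)
```

In the toy example, the first call keeps the classifier's choice because its top probability exceeds the assumed threshold, while the second call falls back to the similarity comparison and selects the class whose reference vector points in nearly the same direction as the feature vector.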