Ensembles of data-efficient vision transformers as a new paradigm for automated classification in ecology.

Affiliations

Eawag, Überlandstrasse 133, 8600, Dübendorf, Switzerland.

WSL, Zürcherstrasse 111, 8903, Birmensdorf, Switzerland.

Publication information

Sci Rep. 2022 Nov 3;12(1):18590. doi: 10.1038/s41598-022-21910-0.

DOI:10.1038/s41598-022-21910-0

PMID:36329061

Original article: https://pmc.ncbi.nlm.nih.gov/articles/PMC9633651/

Abstract

Monitoring biodiversity is paramount to manage and protect natural resources. Collecting images of organisms over large temporal or spatial scales is a promising practice to monitor the biodiversity of natural ecosystems, providing large amounts of data with minimal interference with the environment. Deep learning models are currently used to automate classification of organisms into taxonomic units. However, imprecision in these classifiers introduces a measurement noise that is difficult to control and can significantly hinder the analysis and interpretation of data. We overcome this limitation through ensembles of Data-efficient image Transformers (DeiTs), which not only are easy to train and implement, but also significantly outperform the previous state of the art (SOTA). We validate our results on ten ecological imaging datasets of diverse origin, ranging from plankton to birds. On all the datasets, we achieve a new SOTA, with a reduction of the error with respect to the previous SOTA ranging from 29.35% to 100.00%, and often achieving performances very close to perfect classification. Ensembles of DeiTs perform better not because of superior single-model performances but rather due to smaller overlaps in the predictions by independent models and lower top-1 probabilities. This increases the benefit of ensembling, especially when using geometric averages to combine individual learners. While we only test our approach on biodiversity image datasets, our approach is generic and can be applied to any kind of images.
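The abstract mentions two quantities that are easy to make concrete: combining individual learners with a geometric average, and reporting improvement as a reduction of the error relative to the previous SOTA. The sketch below illustrates both with the Python standard library only. It is not the authors' code. The abstract does not state the exact formulas, so the normalized geometric mean and the relative-error definition used here are assumed, commonly used formulations, and the function names and probability values are hypothetical.

```python
import math


def arithmetic_ensemble(prob_vectors):
    """Average the class probabilities of several models (arithmetic mean)."""
    n_models = len(prob_vectors)
    n_classes = len(prob_vectors[0])
    return [sum(p[c] for p in prob_vectors) / n_models for c in range(n_classes)]


def geometric_ensemble(prob_vectors, eps=1e-12):
    """Combine class probabilities with a normalized geometric mean.

    Assumption of this sketch: the mean is computed in log space and
    renormalized to sum to 1; eps guards against log(0).
    """
    n_models = len(prob_vectors)
    n_classes = len(prob_vectors[0])
    log_mean = [
        sum(math.log(max(p[c], eps)) for p in prob_vectors) / n_models
        for c in range(n_classes)
    ]
    shift = max(log_mean)  # subtract the maximum for numerical stability
    unnormalized = [math.exp(v - shift) for v in log_mean]
    total = sum(unnormalized)
    return [u / total for u in unnormalized]


def relative_error_reduction(previous_error, new_error):
    """Percentage reduction of the error with respect to a previous error."""
    return 100.0 * ((previous_error - new_error) / previous_error)


def argmax(values):
    """Index of the largest entry."""
    return max(range(len(values)), key=lambda i: values[i])


# Hypothetical softmax outputs of three models for one image, three classes.
model_probs = [
    [0.80, 0.19, 0.01],
    [0.80, 0.19, 0.01],
    [0.001, 0.60, 0.399],
]

arith = arithmetic_ensemble(model_probs)
geom = geometric_ensemble(model_probs)

print("arithmetic mean:", arith, "-> class", argmax(arith))
print("geometric mean: ", geom, "-> class", argmax(geom))
print("error reduction:", relative_error_reduction(0.02, 0.0), "%")
```

In this made-up example the two rules disagree: the arithmetic mean follows the two confident models, while the geometric mean is pulled away from class 0 because one model assigns it a near-zero probability. This shows how the two averaging rules can differ, not what the paper measured.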
