Laboratory for Analysis and Architecture of Systems (LAAS-CNRS), University of Toulouse, 31077 Toulouse, France.
RF Innovation, 20 Avenue Didier Daurat, 31400 Toulouse, France.
Sensors (Basel). 2024 Sep 19;24(18):6067. doi: 10.3390/s24186067.
Beehive health monitoring has gained interest in the study of bees in biology, ecology, and agriculture. Because audio sensors are less intrusive, a number of audio datasets (mainly labeled with the presence of a queen in the hive) have appeared in the literature, and interest in classifying them has grown. All studies report good accuracy, but a few have questioned this result and shown that classification does not generalize to unseen hives. To increase the number of known hives, open datasets are reviewed and merged into the "BeeTogether" dataset, proposed on the open Kaggle platform. This common framework standardizes the data format and features while providing data augmentation techniques and a methodology for measuring extrapolation to unseen hives. A classical classifier is proposed to benchmark the whole dataset; it achieves the same good accuracy and poor hive generalization as those found in the literature. Insight is provided into the role of frequency in classifying the presence of a queen, and it is shown that this frequency depends mostly on the colony a recording belongs to. New classifiers inspired by contrastive learning are introduced to circumvent the effect of colony membership and to obtain both good accuracy and hive extrapolation abilities when learning changes in labels. A process for obtaining absolute labels was prototyped on an unsupervised dataset. Solving hive extrapolation with a common open platform and a contrastive approach can result in effective applications in agriculture.
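The gap between good accuracy and poor generalization to unseen hives depends on how the data are split. The following is a minimal sketch of one common way to measure hive extrapolation, a leave-one-hive-out evaluation; it is not necessarily the methodology used in the paper. The function names, the `fit`/`predict` callables, and the per-recording `hive_ids` list are assumptions made for illustration.

```python
from collections import defaultdict


def leave_one_hive_out(hive_ids):
    """Yield (held_out_hive, train_indices, test_indices), one fold per hive.

    hive_ids is a hypothetical list giving the hive of each recording.
    Every recording of the held-out hive goes to the test side, so the
    classifier is always scored on a hive it has never seen.
    """
    by_hive = defaultdict(list)
    for idx, hive in enumerate(hive_ids):
        by_hive[hive].append(idx)
    for hive in sorted(by_hive):
        test_idx = by_hive[hive]
        train_idx = sorted(
            i for other, idxs in by_hive.items() if other != hive for i in idxs
        )
        yield hive, train_idx, test_idx


def unseen_hive_accuracy(features, labels, hive_ids, fit, predict):
    """Return {hive: accuracy} where each hive is scored while held out.

    fit(X, y) -> model and predict(model, x) -> label are placeholders
    for any classifier; they are assumptions, not the paper's API.
    """
    scores = {}
    for hive, train_idx, test_idx in leave_one_hive_out(hive_ids):
        model = fit([features[i] for i in train_idx],
                    [labels[i] for i in train_idx])
        hits = sum(predict(model, features[i]) == labels[i] for i in test_idx)
        scores[hive] = hits / len(test_idx)
    return scores
```

Comparing these per-hive scores with the accuracy from a random split that mixes recordings of the same hive across train and test gives a rough indication of how much a classifier relies on colony identity instead of the queen label.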