Hoffman Benjamin, Cusimano Maddie, Baglione Vittorio, Canestrari Daniela, Chevallier Damien, DeSantis Dominic L, Jeantet Lorène, Ladds Monique A, Maekawa Takuya, Mata-Silva Vicente, Moreno-González Víctor, Pagano Anthony M, Trapote Eva, Vainio Outi, Vehkaoja Antti, Yoda Ken, Zacarian Katherine, Friedlaender Ari
Earth Species Project, Berkeley, CA, USA.
Universidad de León, León, Spain.
Mov Ecol. 2024 Dec 18;12(1):78. doi: 10.1186/s40462-024-00511-8.
Animal-borne sensors ('bio-loggers') can record a suite of kinematic and environmental data, which are used to elucidate animal ecophysiology and improve conservation efforts. Machine learning techniques are used to interpret the large amounts of data recorded by bio-loggers, but there is no common framework for comparing the different machine learning techniques in this domain. This makes it difficult to identify, for example, patterns in what works well for machine learning-based analysis of bio-logger data. It also makes it difficult to evaluate the effectiveness of novel methods developed by the machine learning community.
To address this, we present the Bio-logger Ethogram Benchmark (BEBE), a collection of datasets with behavioral annotations, as well as a modeling task and evaluation metrics. BEBE is to date the largest, most taxonomically diverse, publicly available benchmark of this type, and includes 1654 h of data collected from 149 individuals across nine taxa. Using BEBE, we compare the performance of deep and classical machine learning methods for identifying animal behaviors based on bio-logger data. As an example usage of BEBE, we test an approach based on self-supervised learning. To apply this approach to animal behavior classification, we adapt a deep neural network pre-trained with 700,000 h of data collected from human wrist-worn accelerometers.
We find that deep neural networks outperform the classical machine learning methods we tested across all nine datasets in BEBE. We additionally find that the approach based on self-supervised learning outperforms the alternatives we tested, especially in settings where little training data is available.
In light of these results, we are able to make concrete suggestions for designing studies that rely on machine learning to infer behavior from bio-logger data. Therefore, we expect that BEBE will be useful for making similar suggestions in the future, as additional hypotheses about machine learning techniques are tested. Datasets, models, and evaluation code are publicly available at https://github.com/earthspecies/BEBE to enable community use of BEBE.