Petroni L, Natucci L, Massolo A
Ethology Unit, Department of Biology, University of Pisa, Pisa, Italy.
Faculty of Veterinary Medicine, University of Calgary, Calgary, Alberta, Canada.
Ecol Evol. 2024 Dec 12;14(12):e70544. doi: 10.1002/ece3.70544. eCollection 2024 Dec.
Camera trapping has become increasingly common in ecological studies, but its use is hindered by the effort required to analyze the large image datasets it produces. Recently, artificial intelligence (deep learning models in particular) has emerged as a promising solution. However, applying deep learning to image processing is complex and often requires programming skills in Python, which reduces its accessibility. Some authors have addressed this issue with user-friendly software, and a further advance was the transposition of deep learning to R, a statistical language frequently used by ecologists, which enhances the flexibility and customization of deep learning models without requiring advanced computer expertise. We aimed to develop a user-friendly workflow based on R scripts to streamline the entire process, from selecting to classifying camera trap images. Our workflow integrates the MegaDetector object detector for labelling images and custom training of the state-of-the-art YOLOv8 model, together with optional offline image augmentation to manage imbalanced datasets. Inference results are stored in a database compatible with Timelapse for quality checking of model predictions. We tested our workflow on images collected within a project targeting medium and large mammals of Central Italy, and obtained an overall precision of 0.962, a recall of 0.945, and a mean average precision of 0.913 with a training set of only 1000 pictures per species. Furthermore, the custom model achieved 91.8% correct species-level classifications on a set of unclassified images, reaching 97.1% for those classified with > 90% confidence. YOLO, a fast and lightweight deep learning architecture, enables application of the workflow even on resource-limited machines, and integration with image augmentation makes it useful even during the early stages of data collection. All R scripts and pretrained models are available to enable adaptation of the workflow to other contexts and further development.
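The reported figures rest on two simple calculations: per-species precision and recall, and the share of correct classifications among predictions above a confidence threshold. The sketch below illustrates both in Python on a handful of made-up records. It is not the authors' R code. The `Prediction` class, the function names, and the sample data are hypothetical, and the metrics are computed at the level of whole-image classifications, a simplification of the detection-level metrics (which also involve bounding-box matching) that a YOLO validation run reports.

```python
from dataclasses import dataclass
from math import nan


@dataclass
class Prediction:
    """One classified image (hypothetical record layout)."""
    true_species: str
    predicted_species: str
    confidence: float


def accuracy(preds):
    """Fraction of predictions whose species label is correct."""
    if not preds:
        return nan
    correct = sum(p.true_species == p.predicted_species for p in preds)
    return correct / len(preds)


def accuracy_above(preds, threshold):
    """Accuracy restricted to predictions with confidence > threshold.

    Returns the accuracy and the number of predictions retained.
    """
    kept = [p for p in preds if p.confidence > threshold]
    return accuracy(kept), len(kept)


def precision_recall(preds, species):
    """Classification-level precision and recall for one species."""
    tp = sum(p.predicted_species == species and p.true_species == species
             for p in preds)
    fp = sum(p.predicted_species == species and p.true_species != species
             for p in preds)
    fn = sum(p.predicted_species != species and p.true_species == species
             for p in preds)
    precision = tp / (tp + fp) if (tp + fp) else nan
    recall = tp / (tp + fn) if (tp + fn) else nan
    return precision, recall


# Made-up example records, for illustration only (not data from the study).
sample = [
    Prediction("fox", "fox", 0.97),
    Prediction("fox", "fox", 0.95),
    Prediction("badger", "fox", 0.55),
    Prediction("badger", "badger", 0.93),
    Prediction("roe_deer", "roe_deer", 0.88),
    Prediction("roe_deer", "badger", 0.91),
]

if __name__ == "__main__":
    print("overall accuracy:", accuracy(sample))
    print("accuracy above 0.90:", accuracy_above(sample, 0.90))
    print("fox precision/recall:", precision_recall(sample, "fox"))
```

Raising the threshold trades coverage for reliability: fewer images are accepted automatically, but those that are accepted are more likely to be correct, which is the pattern behind the rise from 91.8% to 97.1% reported above. Images below the threshold would be the natural candidates for manual review.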