Australian Rivers Institute - Coast & Estuaries, and School of Environment and Science, Griffith University, Gold Coast, QLD, 4222, Australia.
Environ Monit Assess. 2020 Oct 12;192(11):698. doi: 10.1007/s10661-020-08653-z.
Environmental monitoring guides conservation and is particularly important for aquatic habitats, which are heavily impacted by human activities. Underwater cameras and uncrewed devices monitor aquatic wildlife, but manual processing of footage is a significant bottleneck to rapid data processing and dissemination of results. Deep learning has emerged as a solution, but its ability to accurately detect animals across habitat types and locations is largely untested for coastal environments. Here, we produce five deep learning models using an object detection framework to detect an ecologically important fish, luderick (Girella tricuspidata). We trained two models on footage from single habitats (seagrass or reef) and three on footage from both habitats. All models were subjected to tests from both habitat types. Models performed well on test data from the same habitat type (object detection measure mAP50: 91.7% and 86.9% for seagrass and reef, respectively) but poorly on test sets from the other habitat type (73.3% and 58.4%, respectively). The models trained on a combination of both habitats produced the highest object detection results in both tests (averaging 92.4% and 87.8%, respectively). The ability of the combination-trained models to correctly estimate the ecological abundance metric, MaxN, showed similar patterns. The findings demonstrate that deep learning models extract ecologically useful information from video footage accurately and consistently, and can perform well across habitat types when trained on footage from a variety of habitats.
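For readers unfamiliar with the abundance metric mentioned above: MaxN is conventionally defined as the maximum number of individuals of the target species visible in any single frame of a video. The sketch below (not the authors' code) shows, under illustrative assumptions about the detector output format and a hypothetical 0.5 confidence threshold, how MaxN could be derived from per-frame object detection scores.

```python
# Minimal sketch: derive MaxN from per-frame detector confidence scores.
# Assumptions (not from the paper): each frame yields a list of confidence
# scores, one per predicted luderick bounding box, and detections with a
# score >= 0.5 are counted as fish.

from typing import Iterable, List


def max_n(frame_detections: Iterable[List[float]], score_threshold: float = 0.5) -> int:
    """Return MaxN: the largest count of above-threshold detections in any one frame."""
    best = 0
    for scores in frame_detections:
        count = sum(1 for s in scores if s >= score_threshold)
        best = max(best, count)
    return best


# Example: three frames with per-detection confidences; MaxN here is 2,
# because the second frame has two detections above the threshold.
frames = [[0.91, 0.34], [0.88, 0.76, 0.42], [0.66]]
print(max_n(frames))  # -> 2
```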