Fergus Paul, Chalmers Carl, Matthews Naomi, Nixon Stuart, Burger André, Hartley Oliver, Sutherland Chris, Lambin Xavier, Longmore Steven, Wich Serge
School of Computer Science and Mathematics, Liverpool John Moores University, James Parsons Building, Byrom Street, Liverpool L3 3AF, UK.
Chester Zoo, Upton-by-Chester, Chester CH2 1EU, UK.
Sensors (Basel). 2024 Dec 19;24(24):8122. doi: 10.3390/s24248122.
Camera traps offer enormous new opportunities in ecological studies, but current automated image analysis methods often lack the contextual richness needed to support impactful conservation outcomes. Integrating vision-language models into these workflows could address this gap by providing enhanced contextual understanding and enabling advanced queries across temporal and spatial dimensions. Here, we present an integrated approach that combines deep learning-based vision and language models to improve ecological reporting using data from camera traps. We introduce a two-stage system: YOLOv10-X to localise and classify species (mammals and birds) within images, and a Phi-3.5-vision-instruct model that reads the YOLOv10-X bounding-box labels to identify species, overcoming the vision-language model's difficulty with hard-to-classify objects in images. Additionally, Phi-3.5 detects broader variables, such as vegetation type and time of day, providing rich ecological and environmental context for YOLO's species detection output. This combined output is processed by the model's natural language system to answer complex queries, and retrieval-augmented generation (RAG) is employed to enrich responses with external information, such as species weight and IUCN status (information that cannot be obtained through direct visual analysis). Together, this information is used to automatically generate structured reports, providing biodiversity stakeholders with deeper insights into, for example, species abundance, distribution, animal behaviour, and habitat selection. Our approach delivers contextually rich narratives that aid in wildlife management decisions. By providing contextually rich insights, our approach not only reduces manual effort but also supports timely decision making in conservation, potentially shifting efforts from reactive to proactive.
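To make the two-stage pipeline described above concrete, the minimal Python sketch below shows one way the detection and vision-language stages could be chained: a YOLOv10-X checkpoint (via the Ultralytics API) localises species, its bounding-box labels are folded into a prompt for a vision-language model, and retrieved facts (e.g. IUCN status) are attached. This is an illustrative sketch, not the authors' implementation; `query_vlm`, `lookup_rag`, and the `yolov10x.pt` weights file are assumed placeholders.

```python
# Illustrative sketch (not the authors' code) of the two-stage pipeline:
# 1) YOLOv10-X localises and classifies species,
# 2) a vision-language model is prompted with the image and the detected
#    labels to add ecological context, and
# 3) retrieval-augmented lookup supplies facts not visible in the image.
from ultralytics import YOLO


def query_vlm(image_path: str, prompt: str) -> str:
    """Placeholder for the Phi-3.5-vision-instruct call described in the paper."""
    raise NotImplementedError("wire up a vision-language model here")


def lookup_rag(species: str) -> dict:
    """Placeholder for retrieval of external facts, e.g. weight and IUCN status."""
    raise NotImplementedError("wire up a retrieval index here")


def analyse_camera_trap_image(image_path: str) -> dict:
    # Stage 1: localise and classify species with a YOLOv10-X checkpoint
    # (assumed local weights file "yolov10x.pt").
    detector = YOLO("yolov10x.pt")
    result = detector(image_path)[0]
    detections = [
        {"species": result.names[int(box.cls)], "conf": float(box.conf)}
        for box in result.boxes
    ]

    # Stage 2: pass the image and the detected labels to the VLM so it can
    # confirm species and describe context (vegetation type, time of day).
    prompt = (
        "Detected species: "
        + ", ".join(d["species"] for d in detections)
        + ". Describe the habitat, vegetation type and time of day."
    )
    context = query_vlm(image_path, prompt)

    # Enrich with retrieved facts that cannot be read from the image itself.
    facts = [lookup_rag(d["species"]) for d in detections]

    return {"detections": detections, "context": context, "facts": facts}
```

Structured outputs of this kind could then be aggregated across a camera-trap deployment to feed the report-generation step described in the abstract.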