

A comparison of statistical methods for deriving occupancy estimates from machine learning outputs.

Author information

Katsis Lydia K D, Rhinehart Tessa A, Dorgay Elizabeth, Sanchez Emma E, Snaddon Jake L, Doncaster C Patrick, Kitzes Justin

Affiliations

School of Geography and Environmental Science, University of Southampton, Southampton, UK.

Department of Biological Sciences, University of Pittsburgh, Pittsburgh, PA, USA.

Publication information

Sci Rep. 2025 Apr 27;15(1):14700. doi: 10.1038/s41598-025-95207-3.

Abstract

The combination of autonomous recording units (ARUs) and machine learning enables scalable biodiversity monitoring. These data are often analysed using occupancy models, yet methods for integrating machine learning outputs with these models are rarely compared. Using the Yucatán black howler monkey as a case study, we evaluated four approaches for integrating ARU data and machine learning outputs into occupancy models: (i) standard occupancy models with verified data, and false-positive occupancy models using (ii) presence-absence data, (iii) counts of detections, and (iv) continuous classifier scores. We assessed estimator accuracy and the effects of decision threshold, temporal subsampling, and verification strategies. We found that classifier-guided listening with a standard occupancy model provided an accurate estimate with minimal verification effort. The false-positive models yielded similarly accurate estimates under specific conditions, but were sensitive to subjective choices including decision threshold. The inability to determine stable parameter choices a priori, coupled with the increased computational complexity of several models (i.e. the detection-count and continuous-score models), limits the practical application of false-positive models. In the case of a high-performance classifier and a readily detectable species, classifier-guided listening paired with a standard occupancy model provides a practical and efficient approach for accurately estimating occupancy.
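To make the compared data types concrete, the sketch below shows how per-clip classifier scores might be reduced to presence-absence data, detection counts, or kept as continuous scores, and how a single site's likelihood differs between a standard occupancy model and a simple false-positive model. It is an illustration only, not the authors' code: the function names, the threshold, and all parameter values are hypothetical, and the false-positive likelihood shown is one commonly used parameterization (detection probability `p11` at occupied sites, false-detection probability `p10` at unoccupied sites), which the abstract does not confirm as the form used in the study.

```python
def summarize_scores(scores, threshold):
    """Reduce one site-visit's classifier scores (hypothetical inputs) to the
    three data types named in the abstract."""
    above = [s for s in scores if s >= threshold]
    return {
        "presence": 1 if above else 0,  # presence-absence data
        "count": len(above),            # count of detections
        "scores": list(scores),         # continuous classifier scores
    }


def standard_site_likelihood(history, psi, p):
    """Likelihood of one site's 0/1 detection history under a standard
    single-season occupancy model, which assumes no false positives."""
    occupied = psi
    for y in history:
        occupied *= p if y == 1 else (1.0 - p)
    # An unoccupied site can only produce an all-zero history.
    unoccupied = (1.0 - psi) if not any(history) else 0.0
    return occupied + unoccupied


def false_positive_site_likelihood(history, psi, p11, p10):
    """As above, but unoccupied sites may also yield detections,
    each with probability p10 (assumed parameterization)."""
    occupied = psi
    unoccupied = 1.0 - psi
    for y in history:
        occupied *= p11 if y == 1 else (1.0 - p11)
        unoccupied *= p10 if y == 1 else (1.0 - p10)
    return occupied + unoccupied


# Hypothetical scores for three visits to one site, thresholded at 0.5.
visits = [[0.1, 0.2], [0.7, 0.9, 0.3], [0.05]]
history = [summarize_scores(v, threshold=0.5)["presence"] for v in visits]
```

Changing `threshold` changes `history`, which illustrates why the false-positive models in the abstract are described as sensitive to the decision threshold. Setting `p10` to zero makes the two likelihoods coincide.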


https://cdn.ncbi.nlm.nih.gov/pmc/blobs/7f88/12034756/1c23277d9147/41598_2025_95207_Fig1_HTML.jpg
