

On the impact of Citizen Science-derived data quality on deep learning based classification in marine images.

Affiliations

Biodata Mining Group, Faculty of Technology, Bielefeld University, Bielefeld, Germany.

National Oceanography Centre, University of Southampton Waterfront Campus, Southampton, United Kingdom.

Publication information

PLoS One. 2019 Jun 12;14(6):e0218086. doi: 10.1371/journal.pone.0218086. eCollection 2019.

Abstract

The evaluation of large amounts of digital image data is of growing importance for biology, including for the exploration and monitoring of marine habitats. However, only a tiny percentage of the image data collected is evaluated by marine biologists, who manually interpret and annotate the image contents, a process that can be slow and laborious. To overcome this bottleneck in image annotation, two strategies are increasingly proposed: "citizen science" and "machine learning". In this study, we investigated how the combination of citizen science, to detect objects, and machine learning, to classify megafauna, could be used to automate annotation of underwater images. For this purpose, multiple large data sets of citizen science annotations, with different degrees of the common errors and inaccuracies observed in citizen science data, were simulated by modifying "gold standard" annotations made by an experienced marine biologist. The parameters of the simulation were determined on the basis of two citizen science experiments. This simulation allowed us to analyze the relationship between the outcome of a citizen science study and the quality of the classifications of a deep learning megafauna classifier. The results show great potential for combining citizen science with machine learning, provided that the participants are informed precisely about the annotation protocol. Inaccuracies in the position of the annotation had the most substantial influence on the classification accuracy, whereas the size of the marking and false positive detections had a smaller influence.
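The simulation idea described in the abstract, degrading "gold standard" annotations with position errors, marking-size errors, and false positive detections, could be sketched roughly as follows. This is only an illustrative sketch, not the authors' implementation: the circular `Annotation` type, the Gaussian noise model, the function `simulate_citizen_annotations`, and all parameter names and values are assumptions introduced here, since the abstract does not specify them.

```python
import random
from dataclasses import dataclass


@dataclass(frozen=True)
class Annotation:
    """Hypothetical circular marking: centre (x, y), radius, class label."""
    x: float
    y: float
    radius: float
    label: str


def simulate_citizen_annotations(gold, width, height, position_sigma,
                                 radius_sigma, false_positive_rate,
                                 fp_radius, rng):
    """Return a degraded copy of `gold` (all noise models are assumptions).

    position_sigma:      std. dev. of Gaussian offset added to x and y (pixels)
    radius_sigma:        std. dev. of the relative error applied to the radius
    false_positive_rate: spurious markings added per gold annotation
    fp_radius:           radius given to each spurious marking
    """
    simulated = []
    for ann in gold:
        # Shift the centre, then clamp it so it stays inside the image.
        x = min(max(ann.x + rng.gauss(0.0, position_sigma), 0.0), width)
        y = min(max(ann.y + rng.gauss(0.0, position_sigma), 0.0), height)
        # Scale the radius by a noisy factor, keeping it at least 1 pixel.
        radius = max(1.0, ann.radius * (1.0 + rng.gauss(0.0, radius_sigma)))
        simulated.append(Annotation(x, y, radius, ann.label))

    # Add uniformly placed markings that correspond to no real object.
    n_false = round(false_positive_rate * len(gold))
    for _ in range(n_false):
        simulated.append(Annotation(rng.uniform(0.0, width),
                                    rng.uniform(0.0, height),
                                    fp_radius, "false_positive"))
    return simulated


if __name__ == "__main__":
    gold = [Annotation(100.0, 150.0, 20.0, "megafauna_a"),
            Annotation(400.0, 300.0, 35.0, "megafauna_b")]
    noisy = simulate_citizen_annotations(
        gold, width=640, height=480, position_sigma=10.0,
        radius_sigma=0.2, false_positive_rate=0.5, fp_radius=25.0,
        rng=random.Random(42))
    for ann in noisy:
        print(ann)
```

Varying one parameter at a time (for example only `position_sigma`) while holding the others at zero would give a family of data sets in which a single error type is isolated, which is the kind of comparison the abstract reports.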


https://cdn.ncbi.nlm.nih.gov/pmc/blobs/fa67/6561570/93db5b0a6047/pone.0218086.g001.jpg
