GEOMAR Helmholtz-Center for Ocean Research Kiel, 24148 Kiel, Germany.
Christian-Albrechts University Kiel, Institute of Geosciences, 24118 Kiel, Germany.
Sci Data. 2018 Aug 28;5:180181. doi: 10.1038/sdata.2018.181.
Optical imaging is a common technique in ocean research. Diving robots, towed cameras, drop-cameras and TV-guided sampling gear all produce image data of the underwater environment. Technological advances like 4K cameras, autonomous robots, high-capacity batteries and LED lighting now allow systematic optical monitoring at large spatial scales and in shorter time, but with increased data volume and velocity. Volume and velocity are further increased by growing fleets and emerging swarms of autonomous vehicles that create big data sets in parallel. This generates a need for automated data processing to harvest maximum information. Systematic data analysis benefits from calibrated, geo-referenced data with a clear metadata description, particularly for machine vision and machine learning. Hence, the expensive data acquisition must be documented, and the data should be curated as soon as possible, backed up and made publicly available. Here, we present a workflow towards sustainable marine image analysis. We describe guidelines for data acquisition, curation and management and apply them to the use case of a multi-terabyte deep-sea data set acquired by an autonomous underwater vehicle.