

A 1.2 Billion Pixel Human-Labeled Dataset for Data-Driven Classification of Coastal Environments.

Affiliations

Contractor, U.S. Geological Survey Pacific Coastal and Marine Science Center, Santa Cruz, CA, USA.

U.S. Geological Survey Pacific Coastal and Marine Science Center, Santa Cruz, CA, USA.

Publication information

Sci Data. 2023 Jan 20;10(1):46. doi: 10.1038/s41597-023-01929-2.

Abstract

The world's coastlines are spatially highly variable, coupled-human-natural systems that comprise a nested hierarchy of component landforms, ecosystems, and human interventions, each interacting over a range of space and time scales. Understanding and predicting coastline dynamics necessitates frequent observation from imaging sensors on remote sensing platforms. Machine learning models that carry out supervised (i.e., human-guided) pixel-based classification, or image segmentation, have transformative applications in spatio-temporal mapping of dynamic environments, including transient coastal landforms, sediments, habitats, waterbodies, and water flows. However, these models require large and well-documented training and testing datasets consisting of labeled imagery. We describe "Coast Train," a multi-labeler dataset of orthomosaic and satellite images of coastal environments and corresponding labels. These data include imagery that is diverse in space and time, and contain 1.2 billion labeled pixels, representing over 3.6 million hectares. We use a human-in-the-loop tool especially designed for rapid and reproducible Earth surface image segmentation. Our approach permits image labeling by multiple labelers, in turn enabling quantification of pixel-level agreement over individual and collections of images.
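The abstract does not say which agreement statistic the authors use, so the following is only a minimal sketch of one plausible reading of "pixel-level agreement": the fraction of pixels to which two labelers assign the same class, averaged over all labeler pairs for an image. The function names, the class codes, and the toy 2 x 3 label images are hypothetical and are not taken from the Coast Train dataset or its tooling.

```python
from itertools import combinations


def pixel_agreement(mask_a, mask_b):
    """Fraction of pixels given the same class by two labelers.

    Masks are lists of rows of integer class codes (assumed format).
    """
    if len(mask_a) != len(mask_b) or any(
        len(row_a) != len(row_b) for row_a, row_b in zip(mask_a, mask_b)
    ):
        raise ValueError("masks must have the same shape")
    total = sum(len(row) for row in mask_a)
    if total == 0:
        raise ValueError("masks must contain at least one pixel")
    same = sum(
        a == b
        for row_a, row_b in zip(mask_a, mask_b)
        for a, b in zip(row_a, row_b)
    )
    return same / total


def mean_pairwise_agreement(masks):
    """Average pixel_agreement over every pair of labelers for one image."""
    pairs = list(combinations(masks, 2))
    if not pairs:
        raise ValueError("need at least two label masks")
    return sum(pixel_agreement(a, b) for a, b in pairs) / len(pairs)


# Hypothetical class codes: 0 = water, 1 = sand, 2 = vegetation.
labeler_1 = [[0, 0, 1], [1, 2, 2]]
labeler_2 = [[0, 1, 1], [1, 2, 2]]
labeler_3 = [[0, 0, 1], [1, 1, 2]]

print(pixel_agreement(labeler_1, labeler_2))  # 5 of 6 pixels match
print(mean_pairwise_agreement([labeler_1, labeler_2, labeler_3]))
```

Raw percent agreement is the simplest choice and ignores agreement expected by chance. A chance-corrected statistic or a per-class overlap score could be substituted without changing the structure of the sketch.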


https://cdn.ncbi.nlm.nih.gov/pmc/blobs/978a/9860036/a53caadddae9/41597_2023_1929_Fig1_HTML.jpg
