

A novel automated cloud-based image datasets for high throughput phenotyping in weed classification.

Authors

G C Sunil, Koparan Cengiz, Upadhyay Arjun, Ahmed Mohammed Raju, Zhang Yu, Howatt Kirk, Sun Xin

Affiliations

Department of Agricultural and Biosystems Engineering, North Dakota State University, Fargo, ND, United States.

Department of Plant Science, North Dakota State University, Fargo, ND, United States.

Publication information

Data Brief. 2024 Nov 1;57:111097. doi: 10.1016/j.dib.2024.111097. eCollection 2024 Dec.

Abstract

Deep learning-based weed detection data management involves data acquisition, data labeling, model development, and model evaluation phases. Of these phases, data acquisition and data labeling are labor-intensive and time-consuming steps in building robust models. In addition, low temporal variation of crops and weeds in datasets is one of the limiting factors for developing effective weed detection models. This article describes the cloud-based automatic data acquisition system (CADAS), which captures weed and crop images at fixed time intervals so that plant growth stages are taken into account for weed identification. The CADAS was developed by integrating fifteen visible-spectrum digital cameras with the gphoto2 libraries, external storage, cloud storage, and a computer running the Linux operating system. The dataset from the CADAS contains six weed species and eight crop species for weed and crop detection. A dataset of 2000 images per weed and crop species was publicly released. Raw RGB images were cropped according to bounding box annotations to generate individual JPG images of crop and weed instances. In addition to the cropped images, 200 raw images with label files were publicly released. This dataset holds potential for investigating challenges in deep learning-based weed and crop detection in agricultural settings. Additionally, researchers could use these data alongside field data to boost model performance by reducing the data imbalance problem.
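The cropping step described in the abstract can be illustrated with a minimal sketch. The abstract does not specify the label file format, so this example assumes YOLO-style text labels (one `class cx cy w h` line per instance, with coordinates normalized to the image size). The function names, the output file naming, and the use of the Pillow library are all hypothetical choices for illustration, not the authors' pipeline.

```python
from pathlib import Path


def yolo_to_pixel_box(cx, cy, w, h, img_w, img_h):
    """Convert a normalized centre/size box to (left, top, right, bottom) pixels.

    Assumption: labels are YOLO-style, normalized to [0, 1].
    """
    left = round((cx - w / 2) * img_w)
    top = round((cy - h / 2) * img_h)
    right = round((cx + w / 2) * img_w)
    bottom = round((cy + h / 2) * img_h)
    # Clamp to the image bounds so boxes touching the border stay valid.
    left, right = max(0, min(left, img_w)), max(0, min(right, img_w))
    top, bottom = max(0, min(top, img_h)), max(0, min(bottom, img_h))
    return left, top, right, bottom


def parse_label_file(text):
    """Parse 'class cx cy w h' lines; lines with another shape are skipped."""
    boxes = []
    for line in text.splitlines():
        parts = line.split()
        if len(parts) != 5:
            continue
        boxes.append((int(parts[0]), *map(float, parts[1:])))
    return boxes


def crop_instances(image_path, label_path, out_dir):
    """Write one JPG per annotated instance and return the written paths."""
    from PIL import Image  # Pillow (third-party); imported lazily

    image_path, out_dir = Path(image_path), Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    boxes = parse_label_file(Path(label_path).read_text())
    written = []
    with Image.open(image_path) as img:
        rgb = img.convert("RGB")
        for i, (cls, cx, cy, w, h) in enumerate(boxes):
            box = yolo_to_pixel_box(cx, cy, w, h, *rgb.size)
            if box[2] <= box[0] or box[3] <= box[1]:
                continue  # skip degenerate (zero-area) boxes
            # Hypothetical naming scheme: <raw image stem>_<index>_class<id>.jpg
            out = out_dir / f"{image_path.stem}_{i}_class{cls}.jpg"
            rgb.crop(box).save(out, "JPEG")
            written.append(out)
    return written
```

If the released label files use a different convention (for example absolute pixel corners or XML annotations), only `parse_label_file` and `yolo_to_pixel_box` would need to change; the cropping loop stays the same.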


https://cdn.ncbi.nlm.nih.gov/pmc/blobs/c7eb/11599996/208582661909/gr1.jpg
