Indian major basmati paddy seed varieties images dataset.

Author information

Sharma Arun, Satish Deepshikha, Sharma Sushmita, Gupta Dinesh

Affiliations

International Centre for Genetic Engineering and Biotechnology, Aruna Asaf Ali Marg, New Delhi 110067, India.

Publication information

Data Brief. 2020 Oct 28;33:106460. doi: 10.1016/j.dib.2020.106460. eCollection 2020 Dec.

Abstract

The dataset contains images of 10 of the 32 Indian basmati seed varieties notified by the Government of India. The varieties included are 1121, 1509, 1637, 1718, 1728, BAS-370, CSR 30, Type-3/Dehraduni Basmati, PB-1 and PB-6. Several images of other seeds and related items available in the household are also included. The dataset therefore has 11 classes: ten classes hold images of the ten basmati paddy varieties, and an 11th class, named "Unknown", holds images of a mixture of two morphologically similar paddy varieties (1121 and 1509), different pulses, other grains and related food items. The Unknown class is useful for discriminating paddy seeds from other types of seeds and related food items. All images were captured manually under standard conditions using a custom-developed apparatus and a tablet with a five-megapixel (5 MP) camera, yielding 3210 RGB colour images in JPG format. The data were pre-processed to produce ready-to-use images for training and testing machine learning models. AI-based paddy seed variety classification models have been developed using the dataset. The dataset can be used to build AI-based models for adulteration detection and for automated classification (together with independent devices) at the time of rice threshing, and its classification coverage can be extended by supplementing it with images of additional basmati varieties.
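As an illustration of how the 11 classes might be prepared for training and testing, the following is a minimal sketch of a per-class (stratified) train/test split. It is not the authors' pre-processing pipeline. The directory layout (one folder per class under a root directory), the folder names in `CLASSES`, the 20% test fraction and the helper names `list_images` and `stratified_split` are all assumptions made for the example; only the variety names and the "Unknown" class come from the abstract.

```python
import random
from collections import defaultdict
from pathlib import Path

# Variety names are taken from the abstract; using them as folder names
# is an assumption about how a local copy of the dataset is organised.
CLASSES = [
    "1121", "1509", "1637", "1718", "1728", "BAS-370",
    "CSR 30", "Type-3", "PB-1", "PB-6", "Unknown",
]


def list_images(root):
    """Collect (path, label) pairs, assuming root/<class name>/<image>.jpg."""
    root = Path(root)
    samples = []
    for label in CLASSES:
        class_dir = root / label
        if not class_dir.is_dir():
            continue  # skip classes that are missing from this copy
        for path in sorted(class_dir.iterdir()):
            if path.suffix.lower() in {".jpg", ".jpeg"}:
                samples.append((path, label))
    return samples


def stratified_split(samples, test_fraction=0.2, seed=0):
    """Split samples so each class contributes about test_fraction to the test set."""
    by_label = defaultdict(list)
    for path, label in samples:
        by_label[label].append(path)

    rng = random.Random(seed)  # fixed seed keeps the split reproducible
    train, test = [], []
    for label in sorted(by_label):
        paths = by_label[label]
        rng.shuffle(paths)
        # Keep at least one test image per class when the class has more than one image.
        n_test = max(1, round(len(paths) * test_fraction)) if len(paths) > 1 else 0
        test.extend((p, label) for p in paths[:n_test])
        train.extend((p, label) for p in paths[n_test:])
    return train, test
```

Calling `stratified_split(list_images("path/to/dataset"))` on a local copy (the path here is a placeholder) would return two lists of `(path, label)` pairs. Splitting per class keeps the "Unknown" class and every variety represented in both sets, which matters when class sizes differ.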

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/1f3e/7653079/5dc32fbec880/gr1.jpg
