Mustofa Sumaya, Ahad Md Taimur, Emon Yousuf Rayhan, Sarker Arpita
Daffodil International University, Savar, Dhaka 1340, Bangladesh.
Data Brief. 2024 Sep 10;57:110910. doi: 10.1016/j.dib.2024.110910. eCollection 2024 Dec.
Papaya is popular as both a vegetable and a fruit in developing and developed countries alike, and its cultivation plays a significant role in Bangladesh's agricultural landscape. However, disease is a common impediment to papaya productivity: it degrades quality and yield and causes substantial economic losses for farmers. Research suggests that computer-aided disease diagnosis and machine learning (ML) models can improve papaya production by detecting and classifying diseases. Training such models requires a papaya disease dataset. Moreover, as with many other fruits, papaya diseases may vary from country to country, so country-specific datasets are needed. In this study, a papaya dataset was collected in Dhaka, Bangladesh. The dataset contains 2159 original images in five classes: a healthy control class and four papaya leaf diseases (Anthracnose, Bacterial Spot, Curl, and Ring Spot). Besides the original images, the dataset contains 210 annotated images for each of the five classes. The dataset therefore contains two types of data: the original images and the annotated images. The original images will interest data scientists who perform disease detection with convolutional neural networks (CNNs) and their variants. The annotated images will be useful for detection and segmentation models such as You Only Look Once (YOLO), U-Net, Mask R-CNN, and Single Shot Detection (SSD). Since farm-applicable AI devices and mobile and web applications are in demand, the dataset offers multiple options for integrating ML models into such devices. Data scientists in countries with weather and climate similar to Bangladesh's may also use this dataset in their own context.