Siam A K M Fazlul Kobir, Bishshash Prayma, Nirob Md Asraful Sharker, Mamun Sajib Bin, Assaduzzaman Md, Noori Sheak Rashed Haider
Department of CSE, Daffodil International University, Bangladesh.
Data Brief. 2024 Dec 19;58:111244. doi: 10.1016/j.dib.2024.111244. eCollection 2025 Feb.
A comprehensive dataset on lemon leaf disease can support agricultural research and the improvement of disease management strategies. This dataset was developed from 1354 raw images captured under the guidance of agricultural specialists from July to September 2024 in Charpolisha, Jamalpur, and was then expanded with augmentation techniques, which added 9000 images. The augmentation process applies a set of techniques (flipping, rotation, zooming, shifting, adding noise, shearing, and brightening) to increase the variety with which different lemon leaf conditions are represented. Every image was standardized to a resolution of 800 × 800 pixels to keep the dataset consistent. All images were labelled with one of nine predefined categories: anthracnose, bacterial blight, citrus canker, curl virus, deficiency leaf, dry leaf, healthy leaf, sooty mould, and spider mites. In the present study, a DenseNet-121 architecture was used, with 20 % of the dataset held out for validation and the remaining 80 % used for training. The model was trained for 30 epochs with a batch size of 32, achieving an accuracy of 98.56 % with augmentation and 96.19 % without it. The dataset can serve as a benchmark for developing accurate machine learning models for early disease detection, and it can also contribute to sustainable lemon cultivation by facilitating timely and effective disease management interventions.
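The 80/20 train and validation split described above can be sketched as follows. This is a minimal illustration, not the authors' procedure: the abstract does not say whether the split was stratified by class or how it was seeded, so the per-class stratification, the seed, the underscore-style class names, and the file paths are all assumptions made for the example.

```python
import random
from collections import defaultdict

# The nine categories named in the abstract; the underscore spelling is
# an assumption about folder naming, not taken from the dataset.
CLASSES = [
    "anthracnose", "bacterial_blight", "citrus_canker", "curl_virus",
    "deficiency_leaf", "dry_leaf", "healthy_leaf", "sooty_mould",
    "spider_mites",
]


def stratified_split(samples, val_fraction=0.2, seed=42):
    """Split (path, label) pairs into train and validation lists.

    Each class is shuffled and split separately, so every class keeps
    roughly the same validation share (assumed here, not confirmed by
    the source).
    """
    by_label = defaultdict(list)
    for path, label in samples:
        by_label[label].append(path)

    rng = random.Random(seed)  # fixed seed keeps the split reproducible
    train, val = [], []
    for label in sorted(by_label):
        paths = sorted(by_label[label])
        rng.shuffle(paths)
        n_val = round(len(paths) * val_fraction)
        val += [(p, label) for p in paths[:n_val]]
        train += [(p, label) for p in paths[n_val:]]
    return train, val


# Hypothetical file names: 10 placeholder images per class.
samples = [(f"{c}/img_{i:03d}.jpg", c) for c in CLASSES for i in range(10)]
train, val = stratified_split(samples)
print(len(train), len(val))  # 72 18
```

With the real image folders, `samples` would be built by listing the files in each class directory. If augmented images are included, it is safer to split the raw images first and augment only the training portion, so that variants of the same leaf do not appear on both sides of the split.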