College of Plant Protection, South China Agricultural University, Guangzhou, China.
College of Mathematics and Informatics, South China Agricultural University, Guangzhou, China.
J Sci Food Agric. 2024 Oct;104(13):8070-8078. doi: 10.1002/jsfa.13636. Epub 2024 Jun 15.
With the rapid development of deep learning, the recognition of rice disease images using deep neural networks has become an active research topic. However, most previous studies focus only on modifying deep learning models, and few have systematically examined how dataset size affects image recognition for rice diseases. In this study, a functional model was developed to describe and predict the relationship between dataset size and model recognition accuracy.
VGG16 deep learning models were trained with different quantities of images of rice blast-diseased leaves and healthy rice leaves, and the test accuracy of the resulting models was well fitted by an exponential model (A = 0.9965 - e). As the number of images increased, recognition accuracy rose rapidly at first. Once the number of images exceeded a certain threshold, however, classification accuracy improved only slightly and the marginal benefit of additional images declined. This trend remained similar when the composition of the dataset was changed, whether (i) the disease class was changed, (ii) the number of classes was increased or (iii) the image data were augmented.
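The saturating trend described above can be illustrated with a small curve-fitting sketch. The full form of the fitted exponent is not given here, so the sketch assumes a hypothetical form A = a - exp(-(b * n + c)), where n is the number of training images. The function names, the parameter values and the data points are all illustrative and synthetic; they are not the study's code or measurements.

```python
import math


def predict(n, a, b, c):
    """Accuracy under the assumed model A = a - exp(-(b * n + c))."""
    return a - math.exp(-(b * n + c))


def fit_saturating_exponential(sizes, accuracies, candidate_asymptotes):
    """Fit the assumed model to (n, A) pairs.

    For a fixed asymptote a, log(a - A) = -(b * n + c) is linear in n,
    so b and c come from ordinary least squares. The asymptote is chosen
    by scanning the candidates and keeping the smallest squared error.
    """
    count = len(sizes)
    mean_n = sum(sizes) / count
    spread_n = sum((n - mean_n) ** 2 for n in sizes)
    best = None
    for a in candidate_asymptotes:
        if a <= max(accuracies):
            continue  # log(a - A) would be undefined for this candidate
        logs = [math.log(a - acc) for acc in accuracies]
        mean_log = sum(logs) / count
        slope = sum(
            (n - mean_n) * (y - mean_log) for n, y in zip(sizes, logs)
        ) / spread_n
        intercept = mean_log - slope * mean_n
        b, c = -slope, -intercept
        sse = sum(
            (acc - predict(n, a, b, c)) ** 2
            for n, acc in zip(sizes, accuracies)
        )
        if best is None or sse < best[0]:
            best = (sse, a, b, c)
    if best is None:
        raise ValueError("every candidate asymptote is below the data")
    return best[1], best[2], best[3]


# Synthetic, noise-free data generated from made-up parameters.
sizes = [50, 100, 200, 400, 800, 1600]
accuracies = [predict(n, 0.99, 0.004, 0.7) for n in sizes]

candidates = [0.95 + 0.001 * k for k in range(50)]
a_hat, b_hat, c_hat = fit_saturating_exponential(sizes, accuracies, candidates)

# Marginal benefit of doubling the dataset, early versus late on the curve.
gain_small = predict(200, a_hat, b_hat, c_hat) - predict(100, a_hat, b_hat, c_hat)
gain_large = predict(1600, a_hat, b_hat, c_hat) - predict(800, a_hat, b_hat, c_hat)
print(a_hat, b_hat, c_hat)
print(gain_small, gain_large)
```

Under this assumed form, doubling a small dataset yields a much larger accuracy gain than doubling a large one, which matches the diminishing marginal benefit reported above. With real, noisy measurements a nonlinear least-squares routine would usually be a better choice than this grid scan.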
This study provides a scientific basis for understanding the impact of data size on the accuracy of rice disease image recognition, and may also serve as a reference for researchers constructing image databases. © 2024 Society of Chemical Industry.