Holan Katerina L, White Charles H, Whitham Steven A
Department of Plant Pathology, Entomology, and Microbiology, Iowa State University, Ames, IA 50011.
Cooperative Institute for Research in the Atmosphere, Colorado State University, Fort Collins, CO 80523.
Phytopathology. 2024 May;114(5):990-999. doi: 10.1094/PHYTO-09-23-0313-KC. Epub 2024 Apr 22.
Computer vision approaches to analyzing plant disease data can be both faster and more reliable than traditional manual methods. However, most machine learning applications require manually annotated training data, which can present a challenge for pipeline development. Here, we describe a machine learning approach to quantify disease incidence on maize leaves using U-Net convolutional neural network models. We analyzed several U-Net models trained on increasing amounts of image data, either randomly chosen from a large data pool or randomly chosen from a subset of disease time course data. As the training dataset size increases, the models perform better, but the rate of performance improvement decreases. Additionally, a diverse training dataset can improve model performance and reduce the amount of annotated training data required for satisfactory performance. Models trained on as few as 48 whole-leaf images are able to replicate the ground truth results within our testing dataset. The final model, trained on our entire training dataset, performs similarly to our ground truth data, with an intersection over union value of 0.5002 and an F1 score of 0.6669. This work illustrates the capacity of U-Nets to accurately answer real-world plant pathology questions related to quantification and estimation of plant disease symptoms. Copyright © 2024 The Author(s). This is an open access article distributed under the CC BY-NC-ND 4.0 International license.
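The two segmentation metrics reported in the abstract, intersection over union (IoU) and F1 score, can be illustrated with a minimal sketch. This is not the authors' evaluation code: the function name `iou_and_f1`, the nested-list mask format, and the toy masks are all assumptions made for illustration, and the abstract does not state whether the reported values were pooled over pixels or averaged per image.

```python
def iou_and_f1(pred, truth):
    """Return (IoU, F1) for two same-shaped binary masks.

    Masks are nested lists of 0/1 values (hypothetical format);
    1 marks a pixel labeled as diseased.
    """
    tp = fp = fn = 0
    for pred_row, truth_row in zip(pred, truth):
        for p, t in zip(pred_row, truth_row):
            if p and t:
                tp += 1   # predicted diseased, truly diseased
            elif p:
                fp += 1   # predicted diseased, truly healthy
            elif t:
                fn += 1   # predicted healthy, truly diseased
    union = tp + fp + fn
    if union == 0:
        # Both masks are empty; treating this as perfect agreement
        # is a convention chosen here, not taken from the paper.
        return 1.0, 1.0
    iou = tp / union
    f1 = 2 * tp / (2 * tp + fp + fn)
    return iou, f1


# Toy 3x4 masks: the prediction is shifted one pixel to the right.
truth = [[0, 1, 1, 0],
         [0, 1, 1, 0],
         [0, 0, 0, 0]]
pred = [[0, 0, 1, 1],
        [0, 0, 1, 1],
        [0, 0, 0, 0]]

iou, f1 = iou_and_f1(pred, truth)
print(f"IoU = {iou:.4f}, F1 = {f1:.4f}")
```

For a single pair of masks the two metrics are linked by F1 = 2·IoU / (1 + IoU), so an IoU near 0.5 corresponds to an F1 near 0.667, which is in line with the values reported above.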