Wiesner-Hanks Tyr, Wu Harvey, Stewart Ethan, DeChant Chad, Kaczmar Nicholas, Lipson Hod, Gore Michael A, Nelson Rebecca J
Plant Breeding and Genetics Section, School of Integrative Plant Science, Cornell University, Ithaca, NY, United States.
Department of Computer Science, Columbia University, New York, NY, United States.
Front Plant Sci. 2019 Dec 12;10:1550. doi: 10.3389/fpls.2019.01550. eCollection 2019.
Computer vision models that can recognize plant diseases in the field would be valuable tools for disease management and resistance breeding. Generating enough data to train these models is difficult, however, since only trained experts can accurately identify symptoms. In this study, we describe and implement a two-step method for generating a large amount of high-quality training data with minimal expert input. First, experts located symptoms of northern leaf blight (NLB) in field images taken by unmanned aerial vehicles (UAVs), annotating them quickly at low resolution. Second, non-experts were asked to draw polygons around the identified diseased areas, producing high-resolution ground truths that were automatically screened based on agreement between multiple workers. We then used these crowdsourced data to train a convolutional neural network (CNN), feeding the output into a conditional random field (CRF) to segment images into lesion and non-lesion regions with an accuracy of 0.9979 and an F1 score of 0.7153. The CNN trained on crowdsourced data showed greatly improved spatial resolution compared to one trained on expert-generated data, despite using only one-fifth as many expert annotations. The final model was able to accurately delineate lesions down to the millimeter level from UAV-collected images, the finest scale of aerial plant disease detection achieved to date. The two-step approach to generating training data is a promising method to streamline deep learning approaches for plant disease detection, and for complex plant phenotyping tasks in general.
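The abstract states that non-expert polygon annotations were "automatically screened based on agreement between multiple workers," but does not specify the screening rule. A minimal sketch of one common approach, using mean pairwise intersection-over-union (IoU) between workers' rasterized masks with an illustrative acceptance threshold (both the function names and the 0.5 cutoff are assumptions, not the authors' procedure):

```python
import numpy as np

def pairwise_iou(mask_a: np.ndarray, mask_b: np.ndarray) -> float:
    """Intersection-over-union of two binary lesion masks."""
    inter = np.logical_and(mask_a, mask_b).sum()
    union = np.logical_or(mask_a, mask_b).sum()
    # Two empty masks agree perfectly by convention.
    return float(inter) / float(union) if union > 0 else 1.0

def screen_annotations(masks: list, min_mean_iou: float = 0.5) -> bool:
    """Accept a crowdsourced annotation set only if the mean pairwise
    IoU across all workers' masks meets the threshold."""
    n = len(masks)
    ious = [pairwise_iou(masks[i], masks[j])
            for i in range(n) for j in range(i + 1, n)]
    return float(np.mean(ious)) >= min_mean_iou
```

In this sketch, a set of polygons for the same lesion is kept only when the workers' masks overlap consistently; sets where one worker diverges sharply pull the mean IoU down and are discarded for re-annotation.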
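The reported accuracy of 0.9979 alongside an F1 score of 0.7153 reflects heavy class imbalance: lesion pixels are rare, so pixel accuracy is dominated by the non-lesion class while F1 isolates performance on lesions. A self-contained sketch of how these two metrics are computed for binary segmentation masks (standard definitions, not code from the paper):

```python
import numpy as np

def segmentation_scores(pred: np.ndarray, truth: np.ndarray):
    """Pixel accuracy and lesion-class F1 for binary segmentation masks."""
    pred = pred.astype(bool)
    truth = truth.astype(bool)
    tp = np.logical_and(pred, truth).sum()
    fp = np.logical_and(pred, ~truth).sum()
    fn = np.logical_and(~pred, truth).sum()
    tn = np.logical_and(~pred, ~truth).sum()
    accuracy = (tp + tn) / pred.size
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)
    return float(accuracy), float(f1)
```

On a mask that is almost entirely background, accuracy stays near 1.0 even when many lesion pixels are missed, which is why the paper reports F1 as well.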