Xu Mingle, Yoon Sook, Fuentes Alvaro, Yang Jucheng, Park Dong Sun
Department of Electronics Engineering, Jeonbuk National University, Jeonbuk, South Korea.
Core Research Institute of Intelligent Robots, Jeonbuk National University, Jeonbuk, South Korea.
Front Plant Sci. 2022 Feb 7;12:773142. doi: 10.3389/fpls.2021.773142. eCollection 2021.
Deep learning has shown clear advantages and potential in plant disease recognition and has developed rapidly in recent years. Achieving competitive performance with a deep learning algorithm requires a sufficient amount of annotated data, but in the natural world scarce or imbalanced data are common, and annotated data are expensive or hard to collect. Data augmentation, which aims to create variations of the training data, has proven effective for this issue. Two challenges remain: creating more desirable variations for scarce and imbalanced data, and designing a data augmentation method that also supports object detection and instance segmentation. First, current algorithms create variations only within one specific class, whereas more desirable variations can further improve performance. To address this issue, we propose a novel data augmentation paradigm that can adapt variations from one class to another. In this paradigm, an image in the source domain is translated into the target domain, while the variations unrelated to the domain are maintained. For example, an image of a healthy tomato leaf is translated into a powdery mildew image, and the variations of the healthy leaf, such as the type of tomato leaf, its size, and the viewpoint, are maintained and transferred to the powdery mildew class. Second, current data augmentation is suitable for improving image classification models but may not be appropriate for object detection and instance segmentation models, mainly because the necessary annotations cannot be obtained. In this study, we use a prior mask as input to indicate the area of interest and reuse the original annotations. In this way, our proposed algorithm can be used for all three tasks simultaneously. Further, we collect 1,258 images of tomato leaves with 1,429 instance segmentation annotations, as a single image can contain more than one instance, covering five diseases and healthy leaves. Extensive experimental results on the collected images validate that our new data augmentation algorithm creates useful variations and contributes to improving the performance of diverse deep learning-based methods.
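The idea of using a prior mask to restrict the change to the area of interest and then reusing the original annotations can be illustrated with a minimal sketch. This is not the authors' implementation: in the paper the mask is an input to a learned translation model, whereas here the translated pixels are simply composited back through the mask. The names `toy_translate`, `mask_to_bbox`, and `augment_with_prior_mask` are hypothetical, and `toy_translate` is a placeholder that only brightens pixels so the example runs without a trained generator.

```python
import numpy as np


def toy_translate(image: np.ndarray) -> np.ndarray:
    """Hypothetical stand-in for a learned healthy-to-diseased generator.

    It only brightens the pixels; a real system would use a trained model.
    """
    shifted = image.astype(np.float32) * 0.5 + 127.0
    return np.clip(shifted, 0, 255).astype(np.uint8)


def mask_to_bbox(mask: np.ndarray) -> tuple:
    """Return (x_min, y_min, x_max, y_max) of the non-zero mask pixels."""
    ys, xs = np.nonzero(mask)
    return int(xs.min()), int(ys.min()), int(xs.max()), int(ys.max())


def augment_with_prior_mask(image, mask, target_label, translate=toy_translate):
    """Change only the masked region and reuse the original mask as annotation.

    image: (H, W, 3) uint8 array, mask: (H, W) array with non-zero leaf pixels.
    """
    translated = translate(image)
    region = mask.astype(bool)[..., None]  # (H, W, 1), broadcasts over channels
    augmented = np.where(region, translated, image)
    # The geometry is unchanged, so the original mask and its box still apply;
    # only the class label switches to the target class.
    annotation = {
        "label": target_label,
        "mask": mask.copy(),
        "bbox": mask_to_bbox(mask),
    }
    return augmented, annotation
```

Because the leaf keeps its shape and position, the same mask can serve as the instance segmentation label, its bounding box as the detection label, and the target class as the classification label, which is how a single augmented image can serve all three tasks.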