Najafian Keyhan, Ghanbari Alireza, Sabet Kish Mahdi, Eramian Mark, Shirdel Gholam Hassan, Stavness Ian, Jin Lingling, Maleki Farhad
Department of Computer Science, University of Saskatchewan, Saskatoon, Saskatchewan, Canada.
Mathematics Department, Faculty of Sciences, University of Qom, Qom, Iran.
Plant Phenomics. 2023;5:0025. doi: 10.34133/plantphenomics.0025. Epub 2023 Feb 24.
Deep learning has shown potential in domains with large-scale annotated datasets. However, manual annotation is expensive, time-consuming, and tedious. Pixel-level annotations are particularly costly for semantic segmentation of images with dense, irregular patterns of object instances, such as plant images. In this work, we propose a method for developing high-performing deep learning models for semantic segmentation of such images using little manual annotation. As a use case, we focus on wheat head segmentation. We synthesize a computationally annotated dataset from a few annotated images, a short unannotated video clip of a wheat field, and several video clips containing no wheat, and use it to train a customized U-Net model. To address the distribution shift between the synthesized and real images, we apply three domain adaptation steps that gradually bridge the domain gap. Using only two annotated images, we achieved a Dice score of 0.89 on the internal test set. When further evaluated on a diverse external dataset collected from 18 domains across five countries, this model achieved a Dice score of 0.73. To expose the model to images from different growth stages and environmental conditions, we incorporated two annotated images from each of the 18 domains to further fine-tune the model, which increased the Dice score to 0.91. These results highlight the utility of the proposed approach in the absence of large annotated datasets. Although our use case is wheat head segmentation, the proposed approach can be extended to other segmentation tasks with similarly irregular, repeating patterns of object instances.
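The Dice scores reported above measure overlap between predicted and ground-truth segmentation masks. As a minimal sketch (the function and variable names here are illustrative assumptions, not taken from the paper's code), the metric can be computed from binary masks as:

```python
# Minimal sketch of the Dice score for binary segmentation masks.
# `pred` and `target` are assumed to be same-shape NumPy arrays where
# nonzero pixels mark the foreground (e.g., wheat heads).
import numpy as np

def dice_score(pred: np.ndarray, target: np.ndarray, eps: float = 1e-7) -> float:
    pred = pred.astype(bool)
    target = target.astype(bool)
    intersection = np.logical_and(pred, target).sum()
    # Dice = 2|A ∩ B| / (|A| + |B|); eps guards against two empty masks.
    return float(2.0 * intersection / (pred.sum() + target.sum() + eps))
```

For binary masks this is equivalent to the F1 score, with 1.0 indicating a perfect match and 0.0 indicating no overlap.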