Rawat Shivangana, Chandra Akshay L, Desai Sai Vikas, Balasubramanian Vineeth N, Ninomiya Seishi, Guo Wei
Department of Computer Science and Engineering, Indian Institute of Technology, Hyderabad, India.
Department of Computer Science, University of Freiburg, Germany.
Plant Phenomics. 2022 Feb 24;2022:9795275. doi: 10.34133/2022/9795275. eCollection 2022.
Training deep learning models typically requires large amounts of labeled data, which are expensive to acquire, especially in dense prediction tasks such as semantic segmentation. Moreover, plant phenotyping datasets pose the additional challenges of heavy occlusion and varied lighting conditions, which make annotations more time-consuming to obtain. Active learning reduces the annotation cost by selecting for labeling the samples that are most informative to the model, thus improving model performance with fewer annotations. Active learning for semantic segmentation has been well studied on datasets such as PASCAL VOC and Cityscapes. However, its effectiveness on plant datasets has received little attention. To bridge this gap, we empirically study and benchmark the effectiveness of four uncertainty-based active learning strategies on three natural plant organ segmentation datasets. We also study their behaviour in response to variations in training configuration: the augmentations used, the scale of training images, the active learning batch size, and the train-validation set split.
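As an illustration of what uncertainty-based selection means for segmentation, the following is a minimal sketch, not the authors' implementation. It assumes the model outputs per-pixel softmax probabilities of shape (N, C, H, W) and scores each unlabeled image by its mean per-pixel entropy, which is one common uncertainty measure; the abstract does not name the four strategies studied, so the choice of entropy here is an assumption. The function names `mean_pixel_entropy` and `select_batch` are hypothetical.

```python
import numpy as np


def mean_pixel_entropy(probs, eps=1e-12):
    """Score each image by the mean entropy of its per-pixel class distribution.

    probs: array of shape (N, C, H, W) holding softmax probabilities
           (assumed input format, not taken from the paper).
    Returns an array of shape (N,); higher means more uncertain.
    """
    # Entropy over the class axis gives one value per pixel: shape (N, H, W).
    entropy = -np.sum(probs * np.log(probs + eps), axis=1)
    # Average over the spatial axes to get one score per image.
    return entropy.mean(axis=(1, 2))


def select_batch(probs, batch_size):
    """Return indices of the batch_size most uncertain unlabeled images."""
    scores = mean_pixel_entropy(probs)
    # Sort by descending score; a stable sort keeps ties in input order.
    order = np.argsort(-scores, kind="stable")
    return order[:batch_size].tolist()
```

In an active learning loop, the selected images would be sent for annotation, added to the labeled set, and the model retrained before the next round of selection. The `batch_size` argument corresponds to the active learning batch size varied in the study.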