Wang Yuemin, Stavness Ian
Department of Computer Science, University of Saskatchewan, Saskatoon, SK, Canada.
Front Artif Intell. 2025 Jan 9;7:1498956. doi: 10.3389/frai.2024.1498956. eCollection 2024.
Active learning can significantly decrease the labeling cost of deep learning workflows by directing the limited labeling budget to the data points that have the highest positive impact on model accuracy. It is especially useful for semantic segmentation tasks, where we can selectively label only a few high-impact regions within the selected images. Most established regional active learning algorithms deploy a static-budget querying strategy, in which a fixed percentage of regions is queried in each image. A static budget can result in over- or under-labeling images, because the number of high-impact regions varies from image to image.
In this paper, we present a novel dynamic-budget superpixel querying strategy that queries the optimal number of high-uncertainty superpixels in each image, improving the querying efficiency of regional active learning algorithms designed for semantic segmentation.
For two distinct datasets, we show that allowing a dynamic budget for each image makes the active learning algorithm more effective than static-budget querying at the same low total labeling budget. We investigate both low- and high-budget scenarios, as well as the impact of superpixel size on our dynamic active learning scheme. In a low-budget scenario, our dynamic-budget querying outperforms static-budget querying by 5.6% mIoU on a specialized agriculture field image dataset and by 2.4% mIoU on Cityscapes.
The presented dynamic-budget querying strategy is simple and effective, and it can be easily adapted to other regional active learning algorithms to further improve the data efficiency of semantic segmentation tasks.
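To make the contrast between static-budget and dynamic-budget querying concrete, the following is a minimal sketch and not the authors' implementation. The abstract does not specify how the per-image budget is chosen, so this sketch assumes one plausible rule: pool the superpixels of all images and query the globally most uncertain ones, so that each image's share of the budget follows from its own uncertainty scores. The function names, the `Scores` type, and the uncertainty values are all hypothetical.

```python
from typing import Dict, List

# Hypothetical per-superpixel uncertainty scores, keyed by image id.
Scores = Dict[str, List[float]]


def static_budget_query(scores: Scores, fraction: float) -> Dict[str, List[int]]:
    """Query the same fraction of superpixels in every image (top-k per image)."""
    selected = {}
    for image_id, values in scores.items():
        k = round(fraction * len(values))
        # Rank superpixel indices of this image by descending uncertainty.
        ranked = sorted(range(len(values)), key=lambda i: values[i], reverse=True)
        selected[image_id] = sorted(ranked[:k])
    return selected


def dynamic_budget_query(scores: Scores, total_budget: int) -> Dict[str, List[int]]:
    """Assumed rule: pool all superpixels and query the globally most uncertain ones."""
    pooled = [
        (value, image_id, index)
        for image_id, values in scores.items()
        for index, value in enumerate(values)
    ]
    pooled.sort(key=lambda item: item[0], reverse=True)
    selected: Dict[str, List[int]] = {image_id: [] for image_id in scores}
    for _, image_id, index in pooled[:total_budget]:
        selected[image_id].append(index)
    return {image_id: sorted(indices) for image_id, indices in selected.items()}


# Toy data: one image with uniformly low uncertainty, one with several
# highly uncertain superpixels.
scores = {
    "easy": [0.05, 0.10, 0.02, 0.08],
    "hard": [0.90, 0.70, 0.85, 0.03],
}

static = static_budget_query(scores, fraction=0.5)      # 2 superpixels per image
dynamic = dynamic_budget_query(scores, total_budget=4)  # same total of 4 labels

print("static :", static)
print("dynamic:", dynamic)
```

With the same total of four labeled superpixels, the static strategy spends two labels on each image, while the pooled strategy under this assumed rule spends three on the "hard" image and one on the "easy" image.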