IEEE Trans Pattern Anal Mach Intell. 2022 Jun;44(6):3139-3153. doi: 10.1109/TPAMI.2020.3045882. Epub 2022 May 5.
We address the problem of semantic nighttime image segmentation and improve the state of the art by adapting daytime models to nighttime without using nighttime annotations. Moreover, we design a new evaluation framework to address the substantial uncertainty of semantics in nighttime images. Our central contributions are: 1) a curriculum framework to gradually adapt semantic segmentation models from day to night through progressively darker times of day, exploiting cross-time-of-day correspondences between daytime images from a reference map and dark images to guide the label inference in the dark domains; 2) a novel uncertainty-aware annotation and evaluation framework and metric for semantic segmentation, which includes image regions beyond human recognition capability in the evaluation in a principled fashion; 3) the Dark Zurich dataset, comprising 2416 unlabeled nighttime and 2920 unlabeled twilight images with correspondences to their daytime counterparts, plus a set of 201 nighttime images with fine pixel-level annotations created with our protocol, which serves as a first benchmark for our novel evaluation. Experiments show that our map-guided curriculum adaptation significantly outperforms state-of-the-art methods on nighttime sets, on both standard metrics and our uncertainty-aware metric. Furthermore, our uncertainty-aware evaluation reveals that selective invalidation of predictions can improve results on data with ambiguous content, such as our benchmark, and can benefit safety-oriented applications involving invalid inputs.
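The abstract does not say how predictions are selectively invalidated. As a minimal sketch of the general idea, assuming a simple threshold on per-pixel softmax confidence (this rule, the function name `invalidate_low_confidence`, the `threshold` value, and the `INVALID` label id are all illustrative assumptions, not the authors' method), low-confidence pixels could be marked invalid like this:

```python
import numpy as np

# Hypothetical label id reserved for "invalid" pixels (an assumption for this sketch).
INVALID = 255


def invalidate_low_confidence(probs, threshold=0.5, invalid_label=INVALID):
    """Turn per-pixel class probabilities into a label map with invalid pixels.

    probs: array of shape (C, H, W) holding softmax scores for C classes.
    Returns an (H, W) integer label map in which every pixel whose top
    class probability is below `threshold` is set to `invalid_label`.
    """
    probs = np.asarray(probs, dtype=np.float64)
    labels = probs.argmax(axis=0).astype(np.int64)  # most likely class per pixel
    confidence = probs.max(axis=0)                  # score of that class
    labels[confidence < threshold] = invalid_label  # withhold uncertain predictions
    return labels


# Usage: a 1x2 image with 3 classes; the second pixel is ambiguous.
example_probs = np.array([
    [[0.90, 0.40]],
    [[0.10, 0.35]],
    [[0.00, 0.25]],
])
example_labels = invalidate_low_confidence(example_probs, threshold=0.5)
```

In this sketch the first pixel keeps its predicted class, while the second, whose best score is only 0.40, is marked invalid. How such invalid predictions are scored is defined by the paper's uncertainty-aware metric and is not reproduced here.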