Department of Computer Engineering, Rochester Institute of Technology, Rochester, NY 14623, USA.
Sensors (Basel). 2019 Dec 5;19(24):5361. doi: 10.3390/s19245361.
We propose a new, efficient architecture for semantic segmentation, based on a "Waterfall" Atrous Spatial Pooling architecture, that achieves a considerable increase in accuracy while decreasing the number of network parameters and the memory footprint. The proposed Waterfall architecture leverages the efficiency of progressive filtering in the cascade architecture while maintaining multiscale fields-of-view comparable to spatial pyramid configurations. Additionally, our method does not rely on a postprocessing stage with Conditional Random Fields, which further reduces complexity and required training time. We demonstrate that the Waterfall approach with a ResNet backbone is a robust and efficient architecture for semantic segmentation, obtaining state-of-the-art results on the Pascal VOC and Cityscapes datasets with a significant reduction in the number of parameters.
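The contrast between a spatial pyramid and the waterfall arrangement can be illustrated with a small field-of-view calculation. This is only a sketch under stated assumptions, not the authors' implementation: it assumes 3x3 atrous convolutions with stride 1, measures fields-of-view in feature-map cells (ignoring the backbone), and uses the dilation rates 6, 12, 18, 24 purely as illustrative values. The function names are hypothetical.

```python
def effective_kernel(kernel: int, rate: int) -> int:
    """Span covered by a dilated kernel: k + (k - 1) * (rate - 1)."""
    return kernel + (kernel - 1) * (rate - 1)


def parallel_fields_of_view(rates, kernel=3):
    """Spatial-pyramid style: every branch reads the same input feature map."""
    return [effective_kernel(kernel, r) for r in rates]


def waterfall_fields_of_view(rates, kernel=3):
    """Waterfall style: each branch reads the previous branch's output,
    and the output of every branch is kept as one scale."""
    fields = []
    receptive_field = 1
    for r in rates:
        # A stride-1 convolution widens the receptive field by (span - 1).
        receptive_field += effective_kernel(kernel, r) - 1
        fields.append(receptive_field)
    return fields


RATES = (6, 12, 18, 24)  # illustrative rates, an assumption of this sketch

print(parallel_fields_of_view(RATES))   # [13, 25, 37, 49]
print(waterfall_fields_of_view(RATES))  # [13, 37, 73, 121]
```

Under these assumptions, the waterfall branches still yield a distinct field-of-view per branch, as a pyramid does, while each branch reuses the filtering already done by the one before it.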