Department of Computer Engineering, Rochester Institute of Technology, Rochester, NY 14623, USA.
Sensors (Basel). 2021 Nov 11;21(22):7504. doi: 10.3390/s21227504.
We propose GourmetNet, a single-pass, end-to-end trainable network for food segmentation that achieves state-of-the-art performance. Food segmentation is an important problem because it is the first step in nutrition monitoring and in food volume and calorie estimation. Our novel architecture incorporates both channel attention and spatial attention information in an expanded multi-scale feature representation using our advanced Waterfall Atrous Spatial Pooling module. GourmetNet refines the feature extraction process by merging features from multiple levels of the backbone through the two attention modules. The refined features are then processed by the advanced multi-scale waterfall module, which combines the benefits of cascade filtering and pyramid representations without requiring a separate decoder or post-processing. Our experiments on two food datasets show that GourmetNet significantly outperforms existing state-of-the-art methods.
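The idea of combining cascade filtering with a pyramid representation can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes a 1D signal, a fixed 3-tap kernel, and small dilation rates, whereas a waterfall module in a segmentation network would use learned 2D convolutions over backbone features. The function names `dilated_conv1d` and `waterfall_features` are hypothetical and chosen only for this example.

```python
def dilated_conv1d(signal, kernel, rate):
    """Same-length 1D atrous (dilated) convolution with zero padding."""
    half = len(kernel) // 2
    n = len(signal)
    out = []
    for i in range(n):
        acc = 0.0
        for k, w in enumerate(kernel):
            j = i + (k - half) * rate  # taps are spaced `rate` samples apart
            if 0 <= j < n:             # positions outside the signal count as zero
                acc += w * signal[j]
        out.append(acc)
    return out


def waterfall_features(signal, kernel=(1.0, 1.0, 1.0), rates=(1, 2, 4)):
    """Illustrative waterfall: each branch filters the previous branch's output
    (cascade), and every branch output is kept and stacked (pyramid)."""
    branches = []
    current = list(signal)
    for rate in rates:
        current = dilated_conv1d(current, kernel, rate)  # cascade step
        branches.append(current)                         # keep this scale
    # One feature vector per position, with one channel per branch.
    return [list(values) for values in zip(*branches)]


if __name__ == "__main__":
    impulse = [0, 0, 0, 0, 1, 0, 0, 0, 0]
    for position, feature in enumerate(waterfall_features(impulse, rates=(1, 2))):
        print(position, feature)
```

Feeding an impulse through the sketch shows the effect of the cascade: the second branch responds over a wider span than the first because it filters already-filtered values, while stacking both branches preserves the narrower scale alongside the wider one.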