GourmetNet: Food Segmentation Using Multi-Scale Waterfall Features with Spatial and Channel Attention.

Affiliation

Department of Computer Engineering, Rochester Institute of Technology, Rochester, NY 14623, USA.

Publication information

Sensors (Basel). 2021 Nov 11;21(22):7504. doi: 10.3390/s21227504.

DOI: 10.3390/s21227504
PMID: 34833577
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC8624046/
Abstract

We propose GourmetNet, a single-pass, end-to-end trainable network for food segmentation that achieves state-of-the-art performance. Food segmentation is an important problem because it is the first step in nutrition monitoring and in food volume and calorie estimation. Our novel architecture incorporates both channel attention and spatial attention information in an expanded multi-scale feature representation using our advanced Waterfall Atrous Spatial Pooling module. GourmetNet refines the feature extraction process by merging features from multiple levels of the backbone through the two attention modules. The refined features are processed with the advanced multi-scale waterfall module, which combines the benefits of cascade filtering and pyramid representations without requiring a separate decoder or post-processing. Our experiments on two food datasets show that GourmetNet significantly outperforms current state-of-the-art methods.
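
The "waterfall" idea in the abstract (filters applied in cascade, with every intermediate output kept as in a pyramid) can be illustrated with a toy 1-D sketch. This is not the authors' implementation: the 1-D signal, the 3-tap kernel, the dilation rates, the averaging fusion, and all function names are assumptions chosen only for illustration, and the channel and spatial attention modules are omitted.

```python
from typing import List, Sequence


def dilated_conv1d(signal: Sequence[float], kernel: Sequence[float], rate: int) -> List[float]:
    """Same-length 1-D atrous (dilated) convolution with zero padding.

    Kernel taps are spaced `rate` samples apart and centred on each output position.
    """
    half = len(kernel) // 2
    out = []
    for i in range(len(signal)):
        acc = 0.0
        for k, weight in enumerate(kernel):
            j = i + (k - half) * rate
            if 0 <= j < len(signal):  # taps outside the signal contribute zero
                acc += weight * signal[j]
        out.append(acc)
    return out


def waterfall_features(signal: Sequence[float],
                       kernel: Sequence[float],
                       rates: Sequence[int] = (1, 2, 4)) -> List[List[float]]:
    """Run the branches in cascade but keep every branch output.

    The dilation rates are hypothetical; they are not taken from the paper.
    """
    branches = []
    current = list(signal)
    for rate in rates:
        # Cascade: each branch filters the output of the previous branch.
        current = dilated_conv1d(current, kernel, rate)
        # Pyramid: the output at every scale is retained for fusion.
        branches.append(current)
    return branches


def fuse(branches: Sequence[Sequence[float]]) -> List[float]:
    """Average branch outputs position-wise (a stand-in for concatenation plus a learned projection)."""
    n = len(branches)
    return [sum(values) / n for values in zip(*branches)]


impulse = [0, 0, 0, 0, 1, 0, 0, 0, 0]
branches = waterfall_features(impulse, kernel=[1, 1, 1], rates=(1, 2))
fused = fuse(branches)
```

For a unit impulse and an all-ones 3-tap kernel, the first branch responds at 3 positions and the second at 7, so the receptive field grows along the cascade while the narrower response remains available to the fusion step.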

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/e71f/8624046/960f5b1062d7/sensors-21-07504-g001.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/e71f/8624046/137c8ce22eec/sensors-21-07504-g002.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/e71f/8624046/869986486adc/sensors-21-07504-g003.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/e71f/8624046/43e19d008fa9/sensors-21-07504-g004.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/e71f/8624046/3c51b258a97f/sensors-21-07504-g005.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/e71f/8624046/802b03ac4aaf/sensors-21-07504-g006.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/e71f/8624046/77e5473e4d47/sensors-21-07504-g007.jpg

Similar articles

1
GourmetNet: Food Segmentation Using Multi-Scale Waterfall Features with Spatial and Channel Attention.
Sensors (Basel). 2021 Nov 11;21(22):7504. doi: 10.3390/s21227504.
2
Fusion network based on the dual attention mechanism and atrous spatial pyramid pooling for automatic segmentation in retinal vessel images.
J Opt Soc Am A Opt Image Sci Vis. 2022 Aug 1;39(8):1393-1402. doi: 10.1364/JOSAA.459912.
3
A multiple-channel and atrous convolution network for ultrasound image segmentation.
Med Phys. 2020 Dec;47(12):6270-6285. doi: 10.1002/mp.14512. Epub 2020 Oct 18.
4
[Lung parenchyma segmentation based on double scale parallel attention network].
Sheng Wu Yi Xue Gong Cheng Xue Za Zhi. 2022 Aug 25;39(4):721-729. doi: 10.7507/1001-5515.202108005.
5
Waterfall Atrous Spatial Pooling Architecture for Efficient Semantic Segmentation.
Sensors (Basel). 2019 Dec 5;19(24):5361. doi: 10.3390/s19245361.
6
Full-BAPose: Bottom Up Framework for Full Body Pose Estimation.
Sensors (Basel). 2023 Apr 4;23(7):3725. doi: 10.3390/s23073725.
7
Boundary-aware context neural network for medical image segmentation.
Med Image Anal. 2022 May;78:102395. doi: 10.1016/j.media.2022.102395. Epub 2022 Feb 14.
8
Polyp segmentation network with hybrid channel-spatial attention and pyramid global context guided feature fusion.
Comput Med Imaging Graph. 2022 Jun;98:102072. doi: 10.1016/j.compmedimag.2022.102072. Epub 2022 May 11.
9
Collaborative networks of transformers and convolutional neural networks are powerful and versatile learners for accurate 3D medical image segmentation.
Comput Biol Med. 2023 Sep;164:107228. doi: 10.1016/j.compbiomed.2023.107228. Epub 2023 Jul 5.
10
Automatic liver tumor segmentation used the cascade multi-scale attention architecture method based on 3D U-Net.
Int J Comput Assist Radiol Surg. 2022 Oct;17(10):1915-1922. doi: 10.1007/s11548-022-02653-9. Epub 2022 Jun 8.

Cited by

1
Lightweight DeepLabv3+ for Semantic Food Segmentation.
Foods. 2025 Apr 9;14(8):1306. doi: 10.3390/foods14081306.
2
IngredSAM: Open-World Food Ingredient Segmentation via a Single Image Prompt.
J Imaging. 2024 Nov 26;10(12):305. doi: 10.3390/jimaging10120305.
3
Nutritional composition analysis in food images: an innovative Swin Transformer approach.
Front Nutr. 2024 Oct 14;11:1454466. doi: 10.3389/fnut.2024.1454466. eCollection 2024.
4
mid-DeepLabv3+: A Novel Approach for Image Semantic Segmentation Applied to African Food Dietary Assessments.
Sensors (Basel). 2023 Dec 29;24(1):209. doi: 10.3390/s24010209.
5
A New CNN-Based Single-Ingredient Classification Model and Its Application in Food Image Segmentation.
J Imaging. 2023 Sep 29;9(10):205. doi: 10.3390/jimaging9100205.

References

1
Automated food intake tracking requires depth-refined semantic segmentation to rectify visual-volume discordance in long-term care homes.
Sci Rep. 2022 Jan 7;12(1):83. doi: 10.1038/s41598-021-03972-8.
2
Waterfall Atrous Spatial Pooling Architecture for Efficient Semantic Segmentation.
Sensors (Basel). 2019 Dec 5;19(24):5361. doi: 10.3390/s19245361.
3
Recipe1M+: A Dataset for Learning Cross-Modal Embeddings for Cooking Recipes and Food Images.
IEEE Trans Pattern Anal Mach Intell. 2019 Jul 9. doi: 10.1109/TPAMI.2019.2927476.
4
Image Segmentation for Image-Based Dietary Assessment: A Comparative Study.
ISSCS 2013 (2013). 2013 Jul;2013. doi: 10.1109/ISSCS.2013.6651268. Epub 2013 Oct 31.
5
DeepLab: Semantic Image Segmentation with Deep Convolutional Nets, Atrous Convolution, and Fully Connected CRFs.
IEEE Trans Pattern Anal Mach Intell. 2018 Apr;40(4):834-848. doi: 10.1109/TPAMI.2017.2699184. Epub 2017 Apr 27.
6
Food Recognition: A New Dataset, Experiments, and Results.
IEEE J Biomed Health Inform. 2017 May;21(3):588-598. doi: 10.1109/JBHI.2016.2636441. Epub 2016 Dec 7.
7
SegNet: A Deep Convolutional Encoder-Decoder Architecture for Image Segmentation.
IEEE Trans Pattern Anal Mach Intell. 2017 Dec;39(12):2481-2495. doi: 10.1109/TPAMI.2016.2644615. Epub 2017 Jan 2.
8
Spatial Pyramid Pooling in Deep Convolutional Networks for Visual Recognition.
IEEE Trans Pattern Anal Mach Intell. 2015 Sep;37(9):1904-16. doi: 10.1109/TPAMI.2015.2389824.