Using Variational Multi-view Learning for Classification of Grocery Items.

Authors

Klasson Marcus, Zhang Cheng, Kjellström Hedvig

Affiliations

Division of Robotics, Perception, and Learning, Lindstedtsvägen 24, 114 28 Stockholm, Sweden.

Microsoft Research Ltd, 21 Station Road, Cambridge CB1 2FB, UK.

Publication information

Patterns (N Y). 2020 Nov 13;1(8):100143. doi: 10.1016/j.patter.2020.100143.

DOI: 10.1016/j.patter.2020.100143
PMID: 33294874
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC7691398/
Abstract

An essential task for computer vision-based assistive technologies is to help visually impaired people recognize objects in constrained environments, for instance, recognizing food items in grocery stores. In this paper, we introduce a novel dataset with natural images of groceries (fruits, vegetables, and packaged products), where all images have been taken inside grocery stores to resemble a shopping scenario. Additionally, we download iconic images and text descriptions for each item that can be utilized for better representation learning of groceries. We select a multi-view generative model, which can combine the different item information into lower-dimensional representations. The experiments show that utilizing the additional information yields higher accuracy in classifying grocery items than using the natural images alone. We observe that iconic images help to construct representations separated by visual differences of the items, while text descriptions enable the model to distinguish between visually similar items by their different ingredients.

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/7b29/7691398/c1de3d9f36a5/gr1.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/7b29/7691398/488b539c7988/gr2.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/7b29/7691398/4f6abb658046/gr3.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/7b29/7691398/7870bcf167e7/gr4.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/7b29/7691398/3ea19121baa9/gr5.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/7b29/7691398/046b56d0def9/gr6.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/7b29/7691398/50110dbca8d7/gr7.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/7b29/7691398/16e9e0b7a444/gr8.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/7b29/7691398/f40792aee08e/gr9.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/7b29/7691398/956ab773277e/gr10.jpg
