Recipe1M+: A Dataset for Learning Cross-Modal Embeddings for Cooking Recipes and Food Images.

Authors

Marin Javier, Biswas Aritro, Ofli Ferda, Hynes Nicholas, Salvador Amaia, Aytar Yusuf, Weber Ingmar, Torralba Antonio

Publication information

IEEE Trans Pattern Anal Mach Intell. 2019 Jul 9. doi: 10.1109/TPAMI.2019.2927476.

Abstract

In this paper, we introduce Recipe1M+, a new large-scale, structured corpus of over one million cooking recipes and 13 million food images. As the largest publicly available collection of recipe data, Recipe1M+ affords the ability to train high-capacity models on aligned, multi-modal data. Using these data, we train a neural network to learn a joint embedding of recipes and images that yields impressive results on an image-recipe retrieval task. Moreover, we demonstrate that regularization via the addition of a high-level classification objective both improves retrieval performance to rival that of humans and enables semantic vector arithmetic. We postulate that these embeddings will provide a basis for further exploration of the Recipe1M+ dataset and food and cooking in general. Code, data and models are publicly available.
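The retrieval task and the semantic vector arithmetic mentioned in the abstract can be illustrated with a minimal sketch. It assumes that recipes and images have already been mapped into a shared vector space; the 3-dimensional vectors, the recipe names, and the helper functions `l2_normalize` and `retrieve` below are hypothetical toy values chosen for illustration, not the authors' model, data, or released code.

```python
import numpy as np

def l2_normalize(x, axis=-1, eps=1e-12):
    """Scale vectors to unit length so dot products equal cosine similarity."""
    norm = np.linalg.norm(x, axis=axis, keepdims=True)
    return x / np.maximum(norm, eps)

def retrieve(query_emb, candidate_embs, k=1):
    """Return indices of the k candidates most similar to the query (cosine)."""
    q = l2_normalize(query_emb)
    c = l2_normalize(candidate_embs)
    sims = c @ q                     # one cosine similarity per candidate
    return np.argsort(-sims)[:k]     # highest similarity first

# Hypothetical hand-written embeddings; axes loosely mean (chicken, pizza, salad).
recipe_names = ["chicken pizza", "chicken salad", "plain salad", "plain pizza"]
recipe_embs = np.array([
    [1.0, 1.0, 0.0],
    [1.0, 0.0, 1.0],
    [0.0, 0.0, 1.0],
    [0.0, 1.0, 0.0],
])

# Pretend this is the embedding of a photo of a chicken salad.
image_emb = np.array([0.9, 0.1, 1.1])

# Image-to-recipe retrieval: rank recipes by similarity to the image.
top = retrieve(image_emb, recipe_embs, k=2)
print([recipe_names[i] for i in top])

# Vector arithmetic in the same space:
# "chicken pizza" - "plain pizza" + "plain salad" should land near "chicken salad".
analogy = recipe_embs[0] - recipe_embs[3] + recipe_embs[2]
best = retrieve(analogy, recipe_embs, k=1)[0]
print(recipe_names[best])  # chicken salad
```

In a trained model the embeddings would come from learned image and recipe encoders rather than hand-written vectors; only the nearest-neighbour ranking step is shown here.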

Similar articles

Learning Structural Representations for Recipe Generation and Food Retrieval.

IEEE Trans Pattern Anal Mach Intell. 2023 Mar;45(3):3363-3377. doi: 10.1109/TPAMI.2022.3181294. Epub 2023 Feb 3.

Disambiguity and Alignment: An Effective Multi-Modal Alignment Method for Cross-Modal Recipe Retrieval.

Foods. 2024 May 23;13(11):1628. doi: 10.3390/foods13111628.

Ki-Cook: clustering multimodal cooking representations through knowledge-infused learning.

Front Big Data. 2023 Jul 24;6:1200840. doi: 10.3389/fdata.2023.1200840. eCollection 2023.

Learning Multi-Modal Nonlinear Embeddings: Performance Bounds and an Algorithm.

IEEE Trans Image Process. 2021;30:4384-4394. doi: 10.1109/TIP.2021.3071688. Epub 2021 Apr 21.

Inclusion of Food Safety Information in Home-delivered U.K. Meal-kit Recipes.

J Food Prot. 2023 Nov;86(11):100162. doi: 10.1016/j.jfp.2023.100162. Epub 2023 Sep 14.

DelicacyNet for nutritional evaluation of recipes.

Front Nutr. 2023 Sep 14;10:1247631. doi: 10.3389/fnut.2023.1247631. eCollection 2023.

A comparison of word embeddings for the biomedical natural language processing.

J Biomed Inform. 2018 Nov;87:12-20. doi: 10.1016/j.jbi.2018.09.008. Epub 2018 Sep 12.

Deep Relation Embedding for Cross-Modal Retrieval.

IEEE Trans Image Process. 2021;30:617-627. doi: 10.1109/TIP.2020.3038354. Epub 2020 Dec 1.

Large Scale Visual Food Recognition.

IEEE Trans Pattern Anal Mach Intell. 2023 Aug;45(8):9932-9949. doi: 10.1109/TPAMI.2023.3237871. Epub 2023 Jun 30.

Cited by

FoodSky: A food-oriented large language model that can pass the chef and dietetic examinations.

Patterns (N Y). 2025 Apr 22;6(5):101234. doi: 10.1016/j.patter.2025.101234. eCollection 2025 May 9.

Cross modal recipe retrieval with fine grained modal interaction.

Sci Rep. 2025 Feb 9;15(1):4842. doi: 10.1038/s41598-025-89461-8.

Adaptafood: an intelligent system to adapt recipes to specialised diets and healthy lifestyles.

Multimed Syst. 2025;31(1):87. doi: 10.1007/s00530-025-01667-y. Epub 2025 Feb 1.

Towards automated recipe genre classification using semi-supervised learning.

PLoS One. 2025 Jan 28;20(1):e0317697. doi: 10.1371/journal.pone.0317697. eCollection 2025.

Visual nutrition analysis: leveraging segmentation and regression for food nutrient estimation.

Front Nutr. 2024 Dec 17;11:1469878. doi: 10.3389/fnut.2024.1469878. eCollection 2024.

An Online Multimodal Food Data Exploration Platform for Specific Population Health: Development Study.

JMIR Form Res. 2024 Nov 15;8:e55088. doi: 10.2196/55088.

Computational gastronomy: capturing culinary creativity by making food computable.

NPJ Syst Biol Appl. 2024 Jul 8;10(1):72. doi: 10.1038/s41540-024-00399-5.

What's On the Menu? Towards Predicting Nutritional Quality of Food Environments.

medRxiv. 2023 Dec 10:2023.12.08.23299691. doi: 10.1101/2023.12.08.23299691.

Surveying Nutrient Assessment with Photographs of Meals (SNAPMe): A Benchmark Dataset of Food Photos for Dietary Assessment.

Nutrients. 2023 Nov 30;15(23):4972. doi: 10.3390/nu15234972.

An AI Dietitian for Type 2 Diabetes Mellitus Management Based on Large Language and Image Recognition Models: Preclinical Concept Validation Study.

J Med Internet Res. 2023 Nov 9;25:e51300. doi: 10.2196/51300.
