

Recipe1M+: A Dataset for Learning Cross-Modal Embeddings for Cooking Recipes and Food Images.

Authors

Javier Marin, Aritro Biswas, Ferda Ofli, Nicholas Hynes, Amaia Salvador, Yusuf Aytar, Ingmar Weber, Antonio Torralba

Publication

IEEE Trans Pattern Anal Mach Intell. 2019 Jul 9. doi: 10.1109/TPAMI.2019.2927476.

Abstract

In this paper, we introduce Recipe1M+, a new large-scale, structured corpus of over one million cooking recipes and 13 million food images. As the largest publicly available collection of recipe data, Recipe1M+ affords the ability to train high-capacity models on aligned, multi-modal data. Using these data, we train a neural network to learn a joint embedding of recipes and images that yields impressive results on an image-recipe retrieval task. Moreover, we demonstrate that regularization via the addition of a high-level classification objective both improves retrieval performance to rival that of humans and enables semantic vector arithmetic. We postulate that these embeddings will provide a basis for further exploration of the Recipe1M+ dataset and food and cooking in general. Code, data and models are publicly available.
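
The abstract describes a joint image-recipe embedding regularized by a high-level semantic classification objective. Below is a minimal sketch of that general idea in PyTorch, assuming pre-extracted image and recipe feature vectors; the feature dimensions, embedding size, class count, and loss weighting are placeholder assumptions, and this is an illustration of the approach rather than the authors' released implementation.

```python
# Illustrative sketch (assumptions noted above), not the paper's actual code:
# two projection branches into a shared embedding space, with a shared
# classification head used as the regularizer mentioned in the abstract.
import torch
import torch.nn as nn
import torch.nn.functional as F


class JointEmbedding(nn.Module):
    """Two-branch model mapping image and recipe features into one space."""

    def __init__(self, img_feat_dim=2048, recipe_feat_dim=1024,
                 embed_dim=1024, num_classes=1000):
        super().__init__()
        # Linear projections from modality-specific features to the shared space.
        self.img_proj = nn.Linear(img_feat_dim, embed_dim)
        self.recipe_proj = nn.Linear(recipe_feat_dim, embed_dim)
        # Shared high-level classifier, used only as a training regularizer.
        self.classifier = nn.Linear(embed_dim, num_classes)

    def forward(self, img_feats, recipe_feats):
        img_emb = F.normalize(self.img_proj(img_feats), dim=-1)
        rec_emb = F.normalize(self.recipe_proj(recipe_feats), dim=-1)
        return img_emb, rec_emb, self.classifier(img_emb), self.classifier(rec_emb)


def joint_loss(img_emb, rec_emb, img_logits, rec_logits, labels, cls_weight=0.02):
    # Alignment term: pull matching image/recipe pairs together in cosine space.
    target = torch.ones(img_emb.size(0), device=img_emb.device)
    align = F.cosine_embedding_loss(img_emb, rec_emb, target)
    # Regularization term: both branches should predict the same semantic class.
    cls = F.cross_entropy(img_logits, labels) + F.cross_entropy(rec_logits, labels)
    return align + cls_weight * cls


# Example with random placeholder features (batch of 8 image/recipe pairs):
model = JointEmbedding()
img_feats = torch.randn(8, 2048)       # e.g. CNN image features
rec_feats = torch.randn(8, 1024)       # e.g. recurrent recipe-encoder features
labels = torch.randint(0, 1000, (8,))  # hypothetical high-level class labels
loss = joint_loss(*model(img_feats, rec_feats), labels)
```

With embeddings trained this way, image-to-recipe retrieval reduces to ranking recipe embeddings by cosine similarity to a query image embedding, and the semantic vector arithmetic mentioned in the abstract corresponds to adding and subtracting L2-normalized embeddings before a nearest-neighbor lookup.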

