Marin Javier, Biswas Aritro, Ofli Ferda, Hynes Nicholas, Salvador Amaia, Aytar Yusuf, Weber Ingmar, Torralba Antonio
IEEE Trans Pattern Anal Mach Intell. 2019 Jul 9. doi: 10.1109/TPAMI.2019.2927476.
In this paper, we introduce Recipe1M+, a new large-scale, structured corpus of over one million cooking recipes and 13 million food images. As the largest publicly available collection of recipe data, Recipe1M+ affords the ability to train high-capacity models on aligned, multi-modal data. Using these data, we train a neural network to learn a joint embedding of recipes and images that yields impressive results on an image-recipe retrieval task. Moreover, we demonstrate that regularization via the addition of a high-level classification objective both improves retrieval performance to rival that of humans and enables semantic vector arithmetic. We postulate that these embeddings will provide a basis for further exploration of the Recipe1M+ dataset and food and cooking in general. Code, data and models are publicly available.
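To make the retrieval and vector-arithmetic ideas in the abstract concrete, here is a minimal sketch in plain Python. It is not the authors' model: the 3-dimensional vectors are invented toy values standing in for learned embeddings, the helper names (`normalize`, `cosine`, `retrieve`) are hypothetical, and cosine similarity is assumed as the ranking score. It only illustrates that once recipes and images share one embedding space, retrieval reduces to a nearest-neighbour search and analogies to vector addition and subtraction.

```python
import math


def normalize(v):
    """Scale a vector to unit length."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]


def cosine(u, v):
    """Cosine similarity between two vectors."""
    return sum(a * b for a, b in zip(normalize(u), normalize(v)))


def retrieve(query, candidates):
    """Return candidate names ordered from most to least similar to the query."""
    return sorted(candidates,
                  key=lambda name: cosine(query, candidates[name]),
                  reverse=True)


# Toy recipe embeddings (assumption: invented values, not learned ones).
recipe_embeddings = {
    "chicken pizza": [0.9, 0.8, 0.1],
    "chicken salad": [0.1, 0.8, 0.9],
    "veggie pizza": [0.9, 0.1, 0.2],
    "veggie salad": [0.1, 0.1, 0.9],
}

# Pretend this came from an image encoder applied to a chicken pizza photo.
image_embedding = [0.85, 0.75, 0.15]

# Image-to-recipe retrieval: rank all recipes against the image embedding.
ranking = retrieve(image_embedding, recipe_embeddings)

# Semantic vector arithmetic: "chicken pizza" - "veggie pizza" + "veggie salad".
analogy = [
    a - b + c
    for a, b, c in zip(recipe_embeddings["chicken pizza"],
                       recipe_embeddings["veggie pizza"],
                       recipe_embeddings["veggie salad"])
]
analogy_ranking = retrieve(analogy, recipe_embeddings)

print(ranking[0])          # closest recipe to the image
print(analogy_ranking[0])  # closest recipe to the arithmetic result
```

With these toy values, the image query ranks "chicken pizza" first and the arithmetic result lands nearest "chicken salad". In a real system the vectors would come from trained image and recipe encoders rather than being written by hand.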