[Paper Reading] Inverse Cooking: Recipe Generation from Food Images

Inverse Cooking: Recipe Generation from Food Images
Amaia Salvador, Michal Drozdzal, Xavier Giro-i-Nieto, Adriana Romero
Universitat Politecnica de Catalunya, Facebook AI Research
@CVPR2019
Presented by Huy Van
Paper Reading 2019

1. Introduction
Problem
• Too many food photos, but limited detailed information about the food
Solution
• Inverse Cooking: infers ingredients and cooking instructions from a food photo

Contributions
• Presents a system that generates cooking instructions conditioned on an image and its ingredients
• Studies ingredients as both a list and a set, and proposes a new architecture for ingredient prediction that exploits co-dependencies among ingredients without imposing order
• Outperforms image-to-recipe retrieval approaches in ingredient prediction

2. Related Work
Food Understanding
• Large-scale datasets: Food-101 and Recipe1M
• Visual food recognition
• Estimating the number of calories given a food image
• Predicting the list of ingredients present
• Finding the recipe for a given image
Multi-label Classification
Conditional Text Generation

3. Generating Recipes from Images

Cooking Instruction Transformer

Ingredient Decoder
• Ingredients as a list (ordered)
• Ingredients as a set (unordered)

Ingredients as a List
• Represent each ingredient as a one-hot vector
• Use a transformer model
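
The one-hot representation of an ordered ingredient list can be sketched as follows. This is a minimal illustration, not the paper's code: the five-word vocabulary and the names `VOCAB`, `one_hot`, and `encode_list` are hypothetical.

```python
# Hypothetical toy vocabulary, used only for illustration.
VOCAB = ["flour", "sugar", "butter", "egg", "salt"]
INDEX = {name: i for i, name in enumerate(VOCAB)}


def one_hot(ingredient):
    """Return a vector with a single 1 at the ingredient's vocabulary index."""
    vec = [0] * len(VOCAB)
    vec[INDEX[ingredient]] = 1
    return vec


def encode_list(ingredients):
    """Encode an ordered ingredient list as a sequence of one-hot vectors."""
    return [one_hot(name) for name in ingredients]


encoded = encode_list(["butter", "sugar", "egg"])
for row in encoded:
    print(row)
```

Because the encoding is a sequence, swapping two ingredients yields a different target, which is the ordering that the list formulation imposes.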

Ingredients as a Set
• Method 1: set transformer
  • Use the transformer as above
  • To remove the order → aggregate the outputs across different time steps using a max pooling operation
• Method 2:
  • Represent the ingredient output as a binary set, then convert it to a target distribution
  • Use a feed-forward network with cross-entropy loss
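
Both set formulations can be sketched in a few lines of plain Python. This is a rough illustration under stated assumptions: `step_probs` stands for hypothetical per-time-step ingredient probabilities from a decoder, and the target distribution is assumed to be the binary vector divided by the number of ingredients present. All function names are illustrative.

```python
import math


def max_pool_over_time(step_probs):
    """Method 1: element-wise max across time steps (one vector per step)."""
    return [max(column) for column in zip(*step_probs)]


def target_distribution(binary_set):
    """Method 2: normalize a binary ingredient vector so it sums to 1.

    Assumption: normalization by the number of ingredients present.
    """
    total = sum(binary_set)
    if total == 0:
        raise ValueError("ingredient set is empty")
    return [x / total for x in binary_set]


def cross_entropy(target, predicted, eps=1e-12):
    """Cross-entropy between a target distribution and predicted probabilities."""
    return -sum(t * math.log(p + eps) for t, p in zip(target, predicted))


# Two decoding orders of the same ingredients pool to the same vector.
steps_a = [[0.9, 0.1, 0.2], [0.1, 0.8, 0.2]]
steps_b = [[0.1, 0.8, 0.2], [0.9, 0.1, 0.2]]
pooled_a = max_pool_over_time(steps_a)
pooled_b = max_pool_over_time(steps_b)

target = target_distribution([1, 1, 0])
loss = cross_entropy(target, [0.45, 0.45, 0.10])
print(pooled_a, pooled_b, target, loss)
```

The max over time steps is symmetric in its inputs, so the pooled vector does not depend on the order in which ingredients were emitted.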

Optimization
2 stages:
- pre-train the image encoder and ingredient decoder
- train the ingredient encoder and instruction decoder

4. Experiments
Dataset
• Recipe1M: includes ~1M recipes
• Preprocessing:
  • rule-based reduction of ingredients from 16,823 to 1,488
  • tokenize raw text and remove less frequent words → 23,231 words
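
The word-filtering step can be sketched as a frequency cutoff over tokenized text. This is only an illustration: whitespace tokenization, lowercasing, the `min_count` parameter, and the toy sentences are all assumptions, since the slide does not state the tokenizer or the threshold used.

```python
from collections import Counter


def build_vocab(texts, min_count):
    """Keep tokens that occur at least min_count times across all texts."""
    counts = Counter(
        token for text in texts for token in text.lower().split()
    )
    return sorted(tok for tok, c in counts.items() if c >= min_count)


# Toy instruction sentences, invented for this example.
texts = ["Mix the flour and sugar", "Add the butter", "Bake the mix"]
vocab = build_vocab(texts, min_count=2)
print(vocab)
```

Raising `min_count` shrinks the vocabulary, trading coverage of rare words for a smaller output space for the instruction decoder.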

Results: Recipe Generation

Results: Ingredient Prediction

Results: Generation vs Retrieval

Results: User Studies

5. Conclusion
• Introduced an image-to-recipe generation system, which takes a food image and produces a recipe consisting of a title, ingredients, and a sequence of cooking instructions
• First predicted sets of ingredients from food images, showing that modeling dependencies matters
• Then explored instruction generation conditioned on images and inferred ingredients, highlighting the importance of reasoning about both modalities at the same time
• Finally, user study results confirm the difficulty of the task and demonstrate the superiority of the system over state-of-the-art image-to-recipe retrieval approaches
