Huy Van
August 18, 2019

[Paper Reading] Inverse Cooking: Recipe Generation from Food Images

Presentation of the paper "Inverse Cooking: Recipe Generation from Food Images" at the Paper Reading study group.

Transcript

  1. Inverse Cooking: Recipe Generation from Food Images
     Amaia Salvador, Michal Drozdzal, Xavier Giro-i-Nieto, Adriana Romero
     Universitat Politecnica de Catalunya, Facebook AI Research
     @CVPR2019
     Presented by Huy Van
     Paper Reading 2019
  2. 1. Introduction
     Problem
     • Too many food photos, but limited detailed information about the food
     Solution
     • Inverse Cooking: infers ingredients and cooking instructions from a food photo
  3. Contributions
     • Presents a system that generates cooking instructions conditioned on an image and its ingredients
     • Studies ingredients as both a list and a set, and proposes a new architecture for ingredient prediction that exploits co-dependencies among ingredients without imposing order
     • Outperforms image-to-recipe retrieval approaches in ingredient prediction
  4. 2. Related Work
     Food Understanding
     • Large-scale datasets: Food-101 and Recipe1M
     • Visual food recognition
     • Estimating the number of calories given a food image
     • Predicting the list of present ingredients
     • Finding the recipe for a given image
     Multi-label Classification
     Conditional Text Generation
  5. Ingredients as a List
     • Represent each ingredient as a one-hot vector
     • Use a transformer model
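
The one-hot representation mentioned on this slide can be sketched as follows. This is a minimal illustration, not the paper's code: the four-item vocabulary `VOCAB` and the helper names `one_hot` and `encode_list` are hypothetical, chosen only to show how an ordered ingredient list becomes a sequence of vectors that a sequence model could consume.

```python
# Hypothetical toy vocabulary, used only for illustration.
VOCAB = ["butter", "egg", "flour", "sugar"]
INDEX = {name: i for i, name in enumerate(VOCAB)}


def one_hot(ingredient):
    """Return a vector with a single 1 at the ingredient's vocabulary index."""
    vec = [0] * len(VOCAB)
    vec[INDEX[ingredient]] = 1
    return vec


def encode_list(ingredients):
    """Encode an ordered ingredient list as a sequence of one-hot vectors."""
    return [one_hot(name) for name in ingredients]


encoded = encode_list(["flour", "egg", "sugar"])
```

Because the encoding is a sequence, reordering the ingredients changes the target, which is the limitation the set-based formulation on the next slide tries to avoid.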
  6. Ingredients as a Set
     • Method 1: set transformer
       • Use a transformer as above
       • To remove the order → aggregate the outputs across different time-steps using a max pooling operation
     • Method 2:
       • Represent the ingredient output as a binary set, then convert it to a target distribution
       • Use a feed-forward network with cross-entropy loss
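
Both ideas on this slide can be illustrated with a small sketch. The function names are hypothetical, and one detail is an assumption rather than something the slide states: the binary set is turned into a target distribution by dividing each entry by the number of ingredients present. The sketch only shows why max pooling over time steps discards order and what a cross-entropy against such a target looks like.

```python
import math


def max_pool_over_time(step_probs):
    """step_probs[t][k] is the probability of ingredient k at decoding step t.

    Taking the maximum over t gives one score per ingredient that does not
    depend on the order in which the steps were produced.
    """
    return [max(column) for column in zip(*step_probs)]


def to_target_distribution(binary_set):
    """Assumed normalization: divide a 0/1 vector by the number of ones."""
    total = sum(binary_set)
    if total == 0:
        raise ValueError("binary_set must contain at least one ingredient")
    return [value / total for value in binary_set]


def cross_entropy(target, predicted, eps=1e-12):
    """Cross-entropy between a target distribution and predicted probabilities."""
    return -sum(t * math.log(p + eps) for t, p in zip(target, predicted))


step_probs = [[0.7, 0.1, 0.2], [0.1, 0.8, 0.1]]
pooled = max_pool_over_time(step_probs)
target = to_target_distribution([1, 0, 1, 0])
loss = cross_entropy(target, [0.5, 0.0, 0.5, 0.0])
```

Reversing `step_probs` leaves `pooled` unchanged, which is the order-invariance the slide refers to.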
  7. Optimization
     2 stages:
     - Pre-train the image encoder and ingredient decoder
     - Train the ingredient encoder and instruction decoder
  8. 4. Experiments
     Dataset
     • Recipe1M: includes ~1M recipes
     • Preprocessing:
       • Rule-based reduction of ingredients from 16,823 to 1,488
       • Tokenize raw text and remove less frequent words → 23,231 words
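
The tokenize-and-filter step could look roughly like the sketch below. The slide does not give the tokenizer or the frequency cut-off, so the whitespace tokenizer, the `min_count` parameter, and the function names are all assumptions made for illustration.

```python
from collections import Counter


def tokenize(text):
    """Assumed tokenizer: lowercase and split on whitespace."""
    return text.lower().split()


def build_vocab(tokenized_recipes, min_count):
    """Keep only words that occur at least min_count times in the corpus."""
    counts = Counter(tok for recipe in tokenized_recipes for tok in recipe)
    return sorted(tok for tok, count in counts.items() if count >= min_count)


# Toy corpus; the threshold of 2 is a hypothetical value.
recipes = [tokenize("Mix flour and sugar"), tokenize("Add flour and egg")]
vocab = build_vocab(recipes, min_count=2)
```

Raising `min_count` shrinks the vocabulary, which is the mechanism by which a large raw word list is cut down to a fixed size.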
  9. 5. Conclusion
     • Introduced an image-to-recipe generation system, which takes a food image and produces a recipe consisting of a title, ingredients and sequence of cooking instructions
     • First predicted sets of ingredients from food images, showing that modeling dependencies matters
     • Then explored instruction generation conditioned on images and inferred ingredients, highlighting the importance of reasoning about both modalities at the same time
     • Finally, user study results confirm the difficulty of the task, and demonstrate the superiority of the system against state-of-the-art image-to-recipe retrieval approaches