Drozdzal, Xavier Giro-i-Nieto, Adriana Romero
Universitat Politecnica de Catalunya, Facebook AI Research
@CVPR2019
Presented by Huy Van
Paper Reading 2019
an image and its ingredients
• Studies ingredients as both a list and a set, and proposes a new architecture for ingredient prediction that exploits co-dependencies among ingredients without imposing an order
• Outperforms image-to-recipe retrieval approaches at ingredient prediction
and Recipe1M
• Visual food recognition
• Estimating the number of calories given a food image
• Predicting the list of ingredients present
• Finding the recipe for a given image
Multi-label Classification
Conditional Text Generation
Use transformer as above
• To remove the order → aggregate the outputs across different time-steps using a max pooling operation
• Method 2:
• Represent the ingredient output as a binary set, then convert it to a target distribution
• Use a feed-forward network with cross-entropy loss
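The two order-free options on this slide can be illustrated with a small sketch. It is not the authors' implementation: the function names, the toy three-ingredient vocabulary, and the choice to normalize the binary set by its number of present ingredients are assumptions made only for illustration.

```python
import math


def pool_over_time(step_probs):
    """Max-pool per-step ingredient probabilities over the time axis.

    step_probs holds one row per decoding step and one column per
    ingredient (hypothetical layout). Taking the max over steps gives a
    result that does not depend on the order of the steps.
    """
    # zip(*rows) walks the data column by column, i.e. per ingredient.
    return [max(column) for column in zip(*step_probs)]


def set_to_target_distribution(binary_set):
    """Turn a 0/1 ingredient vector into a distribution that sums to 1.

    Assumption: each present ingredient gets equal mass.
    """
    total = sum(binary_set)
    if total == 0:
        raise ValueError("at least one ingredient must be present")
    return [value / total for value in binary_set]


def log_softmax(logits):
    """Numerically stable log-softmax over a list of logits."""
    peak = max(logits)
    log_norm = peak + math.log(sum(math.exp(x - peak) for x in logits))
    return [x - log_norm for x in logits]


def soft_target_cross_entropy(logits, target):
    """Cross-entropy between a target distribution and softmax(logits)."""
    return -sum(t * lp for t, lp in zip(target, log_softmax(logits)))


# Toy example: 2 decoding steps, 3 ingredients.
step_outputs = [[0.9, 0.05, 0.05],
                [0.1, 0.8, 0.1]]
pooled = pool_over_time(step_outputs)               # [0.9, 0.8, 0.1]

# Binary set with ingredients 0 and 1 present.
target = set_to_target_distribution([1, 1, 0])      # [0.5, 0.5, 0.0]
loss = soft_target_cross_entropy([0.0, 0.0, 0.0], target)
```

Reversing the rows of `step_outputs` leaves `pooled` unchanged, which is the sense in which max pooling removes the order. In the second option, a feed-forward network would produce the logits, and the loss above would be minimized against the normalized target.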
a food image and produces a recipe consisting of a title, ingredients and a sequence of cooking instructions
• First predicted sets of ingredients from food images, showing that modeling dependencies matters
• Then explored instruction generation conditioned on images and inferred ingredients, highlighting the importance of reasoning about both modalities at the same time
• Finally, user study results confirm the difficulty of the task and demonstrate the superiority of the system over state-of-the-art image-to-recipe retrieval approaches