Decyphering recipes

Building a recipe ontology with neo4j

Gousto Tech

April 26, 2017

Transcript

  1. DECYPHERING RECIPES: Building a recipe ontology with NEO4J
    • Irene Iriarte Carretero, Data Scientist
  2. About Gousto
    • An online recipe box service.
    • Customers come to our site, or use our apps, and select from 22 meals each week.
    • They pick the meals they want to cook and say how many people they’re cooking for.
    • We deliver all the ingredients they need in exact proportions with step-by-step recipe cards in 2-3 days.
    • No planning, no supermarkets and no food waste – you just cook (and eat).
    • We’re a rapidly growing business.
  3. Challenge
    • We need to ensure that we offer customers balanced menus
    • When planning menus, we have to take many constraints into account:
        • Variety: There needs to be a range of proteins and cuisines
        • Operational: There are operational restrictions, such as certain ingredients being unavailable
        • Collections: The menu needs to cover certain collections, such as family and low calorie
    • Hard to do this by hand!
  4. Solution
    • The proposed solution is a menu planning algorithm, which uses:
        • Genetic algorithms: an optimisation technique for multivariable problems, modelled on the process that drives biological evolution
        • Recipe graph DB: a database capturing the relations between recipes and ingredients
    • The genetic algorithm deals with fulfilling constraints and collections
    • The recipe ontology helps us understand recipes and our customers
    • (Diagram labels: SIMILARITY, CONSTRAINTS)
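To make the genetic algorithm idea on this slide concrete, here is a minimal Python sketch that evolves a small menu towards variety in proteins and cuisines. It is not Gousto's algorithm: the recipe pool, `MENU_SIZE`, the fitness function, and the selection, crossover and mutation rules are all assumptions chosen for illustration.

```python
import random

# Hypothetical recipe pool: name -> (protein, cuisine). Not Gousto data.
RECIPES = {
    "chicken_curry": ("chicken", "indian"),
    "beef_tacos": ("beef", "mexican"),
    "veg_lasagne": ("vegetarian", "italian"),
    "chicken_fajitas": ("chicken", "mexican"),
    "salmon_teriyaki": ("fish", "japanese"),
    "beef_ragu": ("beef", "italian"),
    "tofu_stir_fry": ("vegetarian", "chinese"),
    "prawn_paella": ("fish", "spanish"),
}
MENU_SIZE = 4  # assumed; the real menu is larger


def fitness(menu):
    """Variety score: distinct proteins plus distinct cuisines on the menu."""
    proteins = {RECIPES[r][0] for r in menu}
    cuisines = {RECIPES[r][1] for r in menu}
    return len(proteins) + len(cuisines)


def crossover(a, b, rng):
    """Child menu sampled from the union of both parents' recipes."""
    pool = sorted(set(a) | set(b))
    return tuple(sorted(rng.sample(pool, MENU_SIZE)))


def mutate(menu, rng, rate=0.2):
    """With probability `rate`, swap one recipe for one not on the menu."""
    if rng.random() >= rate:
        return menu
    unused = sorted(set(RECIPES) - set(menu))
    kept = list(menu)
    kept[rng.randrange(MENU_SIZE)] = rng.choice(unused)
    return tuple(sorted(kept))


def evolve(generations=30, population_size=20, seed=0):
    rng = random.Random(seed)
    names = sorted(RECIPES)
    population = [tuple(sorted(rng.sample(names, MENU_SIZE)))
                  for _ in range(population_size)]
    for _ in range(generations):
        # Selection: the fitter half survives and acts as parents.
        population.sort(key=fitness, reverse=True)
        parents = population[: population_size // 2]
        children = [mutate(crossover(*rng.sample(parents, 2), rng), rng)
                    for _ in range(population_size - len(parents))]
        population = parents + children
    return max(population, key=fitness)


best_menu = evolve()
print(best_menu, fitness(best_menu))
```

In a fuller version, operational constraints and collections would presumably enter the fitness function as penalties or extra terms; here only variety is scored.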
  5. Recipe Similarity
    • It is hard to put a number to the similarity between two recipes
    • Part of the problem is that it is a very subjective issue
    • Using only ingredients in common does not accurately capture similarity
    • A small example…
  6. None
  7. None
  8. None
  9. Recipe Similarity
    • It is hard to put a number to the similarity between two recipes
    • Part of the problem is that it is a very subjective issue
    • Using only ingredients in common does not accurately capture similarity
    • We need to be able to look at similarity from different points of view:
        • Ingredients
        • Cuisines
        • Presentation
        • Collections
        • (Customers’ taste)
        • Etc.
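The point about ingredient overlap can be illustrated with a short Python sketch. It scores each point of view separately with Jaccard similarity, which is an assumption made here for illustration (the talk does not name a measure), and the three recipes and their attributes are invented.

```python
def jaccard(a, b):
    """Shared items divided by all items across both sets (0.0 if both empty)."""
    a, b = set(a), set(b)
    union = a | b
    return len(a & b) / len(union) if union else 0.0


# Hypothetical recipes, each described from several points of view.
RECIPES = {
    "chicken_tikka": {
        "ingredients": {"chicken", "yoghurt", "garam masala", "rice"},
        "cuisines": {"indian"},
        "presentation": {"curry with rice"},
    },
    "paneer_masala": {
        "ingredients": {"paneer", "tomato", "garam masala", "naan"},
        "cuisines": {"indian"},
        "presentation": {"curry with bread"},
    },
    "chicken_caesar": {
        "ingredients": {"chicken", "yoghurt", "lettuce", "croutons"},
        "cuisines": {"american"},
        "presentation": {"salad"},
    },
}


def similarity_by_view(name_a, name_b):
    """One Jaccard score per point of view, instead of a single number."""
    a, b = RECIPES[name_a], RECIPES[name_b]
    return {view: jaccard(a[view], b[view]) for view in a}


print(similarity_by_view("chicken_tikka", "paneer_masala"))
print(similarity_by_view("chicken_tikka", "chicken_caesar"))
```

On ingredients alone, the salad looks closer to the chicken tikka than the paneer masala does, even though the two curries share a cuisine. Keeping the views separate makes that disagreement visible.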
  10. NEO4J
    • We decided to use a graph database rather than a relational one because:
        • Recipe & ingredient attributes are strongly interconnected, so being able to easily analyse the relations between data is important
        • We need flexibility in how we capture ingredient attributes
        • It allows us to easily draw inferences from data attributes and relations
        • The Cypher query language allows for easy querying of the data
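The flexibility and inference points can be sketched without a database. The Python below is an in-memory stand-in for a graph, not Neo4j or Cypher, and every node name and relation type (`CONTAINS`, `IS_A`, `CONTAINS_ALLERGEN`) is hypothetical.

```python
# Edges as (source, relation, target) triples. A new kind of attribute is
# just a new relation type; no schema change is needed.
EDGES = [
    ("mac_and_cheese", "CONTAINS", "cheddar"),
    ("mac_and_cheese", "CONTAINS", "macaroni"),
    ("cheddar", "IS_A", "cheese"),
    ("cheese", "IS_A", "dairy"),
    ("macaroni", "IS_A", "pasta"),
    ("pasta", "CONTAINS_ALLERGEN", "gluten"),
]


def targets(node, relation):
    """Nodes reachable from `node` over one edge of the given relation."""
    return [t for s, r, t in EDGES if s == node and r == relation]


def categories(ingredient):
    """All categories reachable by following IS_A edges transitively."""
    found, stack = set(), [ingredient]
    while stack:
        for parent in targets(stack.pop(), "IS_A"):
            if parent not in found:
                found.add(parent)
                stack.append(parent)
    return found


def inferred_categories(recipe):
    """Categories a recipe belongs to through any of its ingredients."""
    result = set()
    for ingredient in targets(recipe, "CONTAINS"):
        result |= categories(ingredient)
    return result


print(sorted(inferred_categories("mac_and_cheese")))
```

Nothing in the data states that the recipe contains dairy; that fact is inferred by walking relations, which is the kind of query a graph database is designed to make easy.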
  11. BENCHMARKING
    • We calculate similarities by counting paths between recipes and assigning weights to different attributes
    • It is hard to work out whether the similarity we calculate between recipes is reasonable
    • To work around this, we set up a bot which asked Gousto employees to rate the similarity of certain recipes
    • This allowed us to benchmark our algorithm's results against ratings from humans
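One possible reading of "counting paths and assigning weights" is sketched below in Python: two recipes are linked by a path whenever they share an attribute node, and each path contributes the weight of its attribute type. The data, the attribute types and the weight values are assumptions, not the ones used at Gousto.

```python
# (recipe, attribute type, attribute value) links; all hypothetical.
LINKS = [
    ("thai_green_curry", "ingredient", "coconut milk"),
    ("thai_green_curry", "ingredient", "chicken"),
    ("thai_green_curry", "cuisine", "thai"),
    ("thai_red_curry", "ingredient", "coconut milk"),
    ("thai_red_curry", "ingredient", "beef"),
    ("thai_red_curry", "cuisine", "thai"),
    ("chicken_pie", "ingredient", "chicken"),
    ("chicken_pie", "cuisine", "british"),
]

# Assumed weights: a shared cuisine counts for more than a shared ingredient.
WEIGHTS = {"ingredient": 1.0, "cuisine": 3.0}


def attributes(recipe):
    """Attribute nodes directly linked to a recipe."""
    return {(kind, value) for r, kind, value in LINKS if r == recipe}


def path_similarity(a, b):
    """Sum the weights of all two-step paths recipe -> attribute <- recipe."""
    shared = attributes(a) & attributes(b)
    return sum(WEIGHTS[kind] for kind, _ in shared)


print(path_similarity("thai_green_curry", "thai_red_curry"))
print(path_similarity("thai_green_curry", "chicken_pie"))
```

Scores like these are what the human ratings collected by the bot would be compared against, for example to tune the weights.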
  12. Current Situation
    • We are currently in the process of fully implementing the menu planning algorithm
    • Setting up the structure of the graph database was difficult, as we needed to make sure we were capturing all the different aspects of a recipe
    • We are currently investigating how to improve the similarity estimates
  13. FUTURE OPPORTUNITIES
    • The graph database is the first step towards a recommendation engine
    • This will require adding customer purchases, taste scores, reviews etc.
    • Allow customers to search for exactly what they want
    • Curate recipes for dietary requirements
  14. Thank you for listening!
    • @GoustoTech
    • techbrunch.gousto.co.uk
    • irene@gousto.co.uk