Decyphering recipes

DECYPHERING RECIPES Irene Iriarte Carretero Data Scientist Building a recipe
ontology with NEO4J

About Gousto • An online recipe box service. • Customers
come to our site, or use our apps and select from 22 meals each week. • They pick the meals they want to cook and say how many people they’re cooking for. • We deliver all the ingredients they need in exact proportions with step-by-step recipe cards in 2-3 days. • No planning, no supermarkets and no food waste – you just cook (and eat). • We’re a rapidly growing business.

challenge • We need to ensure that we offer customers
balanced menus • When planning menus, we have to take many constraints into account: • Variety: There needs to be a range of proteins and cuisines • Operational: There are certain operational restrictions like lack of availability of certain ingredients • Collections: Menu needs to fulfil certain collections such as family and low calories • Hard to do this by hand!

Solution • Proposed solution is a menu planning algorithm, which
uses: • Genetic algorithms: algorithm used for multivariable optimisation, based on the process that drives biological evolution • Recipe graph DB: database connecting relations between recipes and ingredients • Genetic algorithm deals with fulfilling constraints and collections • Recipe ontology helps us understand recipes and our customers SIMILARITY CONSTRAINTS

Recipe Similarity • It is hard to put a number
to the similarity between two recipes • Part of the problem is that it is a very subjective issue • Using only ingredients in common does not accurately capture similarity A small example…

Recipe Similarity • It is hard to put a number
to the similarity between two recipes • Part of the problem is that it is a very subjective issue • Using only ingredients in common does not accurately capture similarity • We need to be able to look at similarity from different points of view: • Ingredients • Cuisines • Presentation • Collections • (Customers’ taste) • Etc.

NEO4J • We decided to use a graph database rather
than relational because: • Recipe & ingredient attributes are strongly interconnected – being able to easily analyse the relations between data is important • We need flexibility in terms of capturing ingredient attributes • Allows us to easily create inferences from data attributes and relations • Cypher language allows for easy querying of the data

BENCHMARKING • Calculate similarities by counting paths between recipes and
assigning weights to different attributes • Hard to work out if the similarity between recipes we are calculating is reasonable • To work around this we set up a bot which asked Gousto employees to rate the similarity of certain recipes • This then allowed us to benchmark our algorithm results with those coming from humans

Current Situation • We are currently in the process of
fully implementing the menu planning algorithm • It is difficult to set up structure of graph database, as we needed to make sure we were capturing all the different recipe aspects • Currently investigating how to improve similarity estimations

FUTURE OPPORTUNITIES • The graph database is the first step
towards a recommendation engine • This will require adding customer purchases, taste scores, reviews etc. • Allow customers to search exactly for what they want • Curate recipes for dietary requirements

@GoustoTech techbrunch.gousto.co.uk [email protected] Thank you for Listening!

Decyphering recipes

Decyphering recipes

Gousto Tech

More Decks by Gousto Tech

Other Decks in Technology

Featured

Transcript

DECYPHERING RECIPES Irene Iriarte Carretero Data Scientist Building a recipe

About Gousto • An online recipe box service. • Customers

challenge • We need to ensure that we offer customers

Solution • Proposed solution is a menu planning algorithm, which

Recipe Similarity • It is hard to put a number

Recipe Similarity • It is hard to put a number

NEO4J • We decided to use a graph database rather

BENCHMARKING • Calculate similarities by counting paths between recipes and

Current Situation • We are currently in the process of

FUTURE OPPORTUNITIES • The graph database is the first step

@GoustoTech techbrunch.gousto.co.uk [email protected] Thank you for Listening!