Slide 1

Slide 1 text

Go Global Search 2
Orgil Davaajargal (オリギル)
Cookpad Global

Slide 2

Slide 2 text

$ whoami
Orgil (オリギル)
● From Mongolia
● Joined in 2016 as a new graduate 🌸
● Cookpad Global, search engineer
● Bouldering, photography 📷, robot anime 🤖
● Favorite Gundam series: Z Gundam

Slide 3

Slide 3 text

Agenda
● About Cookpad Global and Global Search
● New platform: global-search-2 (GS2)
● How the search team has changed
● Summary

Slide 4

Slide 4 text

Cookpad Global

Slide 5

Slide 5 text

Cookpad Global
● We run a recipe service in 30+ languages and 70+ countries
● 100M+ monthly users
● A completely separate platform from the JP one
● We believe the world would be a better place if there were more creators

Slide 6

Slide 6 text

30+ languages, 70+ countries

Slide 7

Slide 7 text

Almost 7M recipes!
5x growth in the last 5 years

Slide 8

Slide 8 text

Cookpad HQ: Bristol, UK
Offices in 10 countries

Slide 9

Slide 9 text

Diverse team from all over the world

Slide 10

Slide 10 text


Slide 11

Slide 11 text

RTL: right-to-left languages
Example: Arabic

Slide 12

Slide 12 text

Challenges in 30+ languages
● Unicode normalization
● Tokenization
● Stemming
● Dictionaries
● Grammar
● Compound words
● Continuous languages
More about internationalization
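As an illustration of the Unicode normalization item, here is a minimal sketch using only Python's standard library. The function name `normalize_query` and the choice of NFKC plus case folding are assumptions made for this example; the slides do not say which normalization form GS2 applies.

```python
import unicodedata


def normalize_query(text: str) -> str:
    """Normalize a query so visually equivalent inputs compare equal.

    Hypothetical helper for illustration; not taken from GS2.
    """
    # NFKC folds compatibility forms (for example full-width letters) and
    # composes base characters with their combining marks.
    normalized = unicodedata.normalize("NFKC", text)
    # casefold() is a more aggressive, language-independent lower().
    return normalized.casefold().strip()


# Full-width letters followed by a combining acute accent (U+0301).
full_width = "\uff23\uff41\uff46\uff45\u0301"
# Precomposed form of the same word, upper case, with stray spaces.
precomposed = "  CAF\u00c9 "

print(normalize_query(full_width) == normalize_query(precomposed))
```

Normalizing both the indexed text and the incoming query the same way is what makes the comparison meaningful; language-specific steps such as tokenization and stemming would come after this.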

Slide 13

Slide 13 text

Global Search V2 (GS2)

Slide 14

Slide 14 text

About the Global Search V2 (GS2) project
● The old search (GS1) was a Rails app with Elasticsearch
● We rebuilt the search backend on a whole new tech stack
[Figure labels: 5.x, 7.x, Hako]

Slide 15

Slide 15 text

Before

Slide 16

Slide 16 text

After

Slide 17

Slide 17 text

Changes in GS2
● Python!
● Kubernetes!
● Kafka!
● Elasticsearch 7!
● ML API!
● AB testing!
● Microservices & DevOps!

Slide 18

Slide 18 text

Why GS2?
● More product development for search
  ○ More search engineering resources needed
● Hiring
  ○ Hard to find Rails + Elasticsearch + relevance engineers
● Want to use ML and other new technologies
● Faster iteration cycles, AB testing
  ○ GS1 was not good at AB testing
● Python has large NLP libraries
  ○ Easier ML research integration, language detection libraries, etc.

Slide 19

Slide 19 text

GS2 features

Slide 20

Slide 20 text

Ingestion pipeline
● Kafka event streaming
● Faust: Python stream processing
● Avro schema to share the event structure
● Near-real-time indexing
● Kafka: at-least-once delivery
  ○ Makes sure all events are delivered
● Elasticsearch: document versioning
  ○ Prevents overwriting when running reingestion
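Why at-least-once delivery and document versioning fit together can be sketched with a small in-memory stand-in for the index. Everything here is hypothetical (the `VersionedIndex` class, the event tuples, the integer versions); it only mimics the rule Elasticsearch applies with external versioning, where a write is accepted only if its version is strictly greater than the stored one. The slides do not say which versioning mode or version source GS2 uses.

```python
from dataclasses import dataclass, field


@dataclass
class VersionedIndex:
    """In-memory stand-in for an index with external document versioning."""

    docs: dict = field(default_factory=dict)  # doc_id -> (version, body)

    def index(self, doc_id: str, version: int, body: dict) -> bool:
        """Store the document only if `version` is newer than what we hold."""
        current = self.docs.get(doc_id)
        if current is not None and version <= current[0]:
            # Duplicate or stale event: skip it instead of overwriting.
            return False
        self.docs[doc_id] = (version, body)
        return True


index = VersionedIndex()

# At-least-once delivery means duplicates and out-of-order replays can occur.
events = [
    ("recipe-1", 1, {"title": "Chicken curry"}),
    ("recipe-1", 2, {"title": "Spicy chicken curry"}),
    ("recipe-1", 2, {"title": "Spicy chicken curry"}),  # duplicate delivery
    ("recipe-1", 1, {"title": "Chicken curry"}),        # stale reingestion event
]

results = [index.index(doc_id, version, body) for doc_id, version, body in events]
print(results)
print(index.docs["recipe-1"])
```

Because rejected writes are harmless, the consumer can safely reprocess events after a failure, which is what makes at-least-once delivery acceptable for indexing.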

Slide 21

Slide 21 text

No more shared DB!

Slide 22

Slide 22 text

Enrichment pipeline
Enriches recipe documents
ML APIs
● Image attractiveness
● Text embeddings
● Recipe categorization
● …

Slide 23

Slide 23 text

AB testing
● Uses a feature flag system
● Groups requests by GUID into 16 channels
● Tests can be run by language and country
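Grouping requests by GUID into 16 channels could look roughly like the sketch below. The hashing scheme (SHA-256 modulo 16) and the names `channel_for` and `variant_for` are assumptions for illustration; the slides only state that requests are grouped by GUID into 16 channels.

```python
import hashlib

NUM_CHANNELS = 16  # number of channels, as stated on the slide


def channel_for(guid: str) -> int:
    """Map a GUID to a stable channel number in the range 0..15."""
    # A cryptographic hash is used (instead of Python's built-in hash())
    # so the mapping is identical across processes and restarts.
    digest = hashlib.sha256(guid.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % NUM_CHANNELS


def variant_for(guid: str, treatment_channels: set) -> str:
    """Return which side of a hypothetical experiment this GUID falls into."""
    return "treatment" if channel_for(guid) in treatment_channels else "control"


guid = "3f2b8c1e-0000-4000-8000-000000000001"  # made-up example GUID
print(channel_for(guid), variant_for(guid, {0, 1, 2, 3}))
```

Since the channel depends only on the GUID, the same user keeps the same experience across requests, and the channel number can be logged with each request so an analysis step can compare channels afterwards.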

Slide 24

Slide 24 text

AB test analysis is easy
The data platform handles the experiment channels and shows the results

Slide 25

Slide 25 text

k8s infra: the team owns its resources

Slide 26

Slide 26 text

Search Portal
Admin tool
● Dictionary management
● Debugging search results
● KPI metrics
● Stemming fixes
● Stopword list
Important changes are moderated by an engineer

Slide 27

Slide 27 text

SSMs: Search Success Managers
● Local community managers responsible for search
● In GS2 it is a more dedicated role
● Have a deep understanding of how dictionaries work
● Hold regular workshops to share knowledge
A local person is responsible for the final search adjustments using the dictionary tool
Engineers can't handle 30+ languages and cultures
I can't read these: เนื้อไก चकन ચકન . . .

Slide 28

Slide 28 text

Debugging search results is easier
● Added more debugging capabilities
● SSMs can solve their problems on their own
● Engineers can notice strange behavior

Slide 29

Slide 29 text

ML challenges
● ML enrichment
● Category prediction
● Q2Q
● Ingredient expansion
● Recipe recommendation
● Ingredient normalization
● . . .
Example:
enrichments.classification_tagging.tag = "diet"
enrichments.classification_tagging.weight > 0.8
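The two conditions on the enrichment fields can be expressed as an Elasticsearch query body. The sketch below builds that body as a plain Python dict. It assumes `enrichments.classification_tagging` is mapped as a nested field, so that the tag and the weight are matched on the same entry; the slides do not show the mapping, and the function name `tag_filter` is made up for this example.

```python
import json

# Field path taken from the slide; the nested mapping is an assumption.
TAGGING_PATH = "enrichments.classification_tagging"


def tag_filter(tag: str, min_weight: float) -> dict:
    """Build a query clause matching documents tagged `tag` above `min_weight`."""
    return {
        "nested": {
            "path": TAGGING_PATH,
            "query": {
                "bool": {
                    "filter": [
                        # Exact match on the predicted tag.
                        {"term": {f"{TAGGING_PATH}.tag": tag}},
                        # Keep only predictions with weight strictly above the threshold.
                        {"range": {f"{TAGGING_PATH}.weight": {"gt": min_weight}}},
                    ]
                }
            },
        }
    }


query_body = {"query": tag_filter("diet", 0.8)}
print(json.dumps(query_body, indent=2))
```

If the field were mapped as a plain object instead of nested, the same `term` and `range` clauses could be used directly in a `bool` filter, at the cost of possibly matching the tag and the weight from different entries.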

Slide 30

Slide 30 text

ML challenges
● ML enrichment
● Category prediction
● Q2Q
● Ingredient expansion
● Recipe recommendation
● Ingredient normalization
● . . .
Q2Q (query-to-query) suggestion: suggestions to narrow down the results

Slide 31

Slide 31 text

How we work has changed a lot too

Slide 32

Slide 32 text

In GS1 the team became reactive
● Increasing languages, traffic, and product demand
● Reactive to search inquiries and product requests
● Had to rely on external teams to make changes

Slide 33

Slide 33 text

Proactive team with more capabilities
● More people work on search
● Dedicated platform team
● Web app engineers

Slide 34

Slide 34 text

#ask-search channel
Inquiries from CMs are investigated and answered systematically
● Questions about dictionary management -> SSMs help each other
● Bugs or strange search behavior -> Search Ranger rotation

Slide 35

Slide 35 text

Responsibility change
● We own our infrastructure
  ○ Alerting, SLOs
  ○ On-call shifts
● We own the quality of search and its metrics
● We act proactively to improve search

Slide 36

Slide 36 text

Philosophy for teams building solutions in search: You build it, you own it, you run it.

Slide 37

Slide 37 text

Summary
● A renewal was needed to take on bigger challenges
● GS2 was built on a completely new tech stack
● It has the capabilities to support further endeavours
● The team has more responsibilities

Slide 38

Slide 38 text

Check out our Global Tech blog: Source Diving

Slide 39

Slide 39 text


Slide 40

Slide 40 text

Go Global Search 2

Slide 41

Slide 41 text