Go Global Search 2

Orgil Dj
December 01, 2022

About Cookpad Global's search system.
#CookpadTechConf 2022

Transcript

  1. Go Global Search 2. Orgil Davaajargal (オリギル), Cookpad Global

  2. $ whoami: Orgil (オリギル)
    • From Mongolia
    • Joined 2016 as a new graduate 🌸
    • Cookpad Global, search engineer
    • Bouldering, photography 📷, robot anime 🤖
    • Favorite Gundam series: Z Gundam https://cookpad.com/recipe/730175
  3. Agenda
    • About Cookpad Global, Global Search
    • New platform: global-search-2 (GS2)
    • How the search team has changed
    • Summary
  4. Cookpad Global

  5. Cookpad Global
    • We run a recipe service in 30+ languages and 70+ countries
    • 100M+ monthly users
    • A completely separate platform from the JP one
    • We believe that the world would be a better place if there were more creators
  6. 30+ languages, 70+ countries

  7. Almost 7M recipes! 5x growth in the last 5 years

  8. Cookpad HQ: Bristol, UK. Offices in 10 countries

  9. Diverse team from all over the world

  10. (no text on this slide)

  11. RTL (right-to-left) languages, for example Arabic: https://cookpad.com/sa

  12. Challenges in 30+ languages
    • Unicode normalization
    • Tokenization
    • Stemming
    • Dictionary
    • Grammars
    • Compound words
    • Continuous languages
    More about internationalization: https://techconf.cookpad.com/2017/rejasupotaro.html
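
To make the first challenge on this slide concrete, here is a minimal sketch of Unicode normalization applied to a query before tokenization. It is not taken from the talk: the function name `normalize_query` and the choice of NFKC plus case folding are assumptions for illustration, and a real pipeline would likely need per-language rules on top.

```python
import unicodedata


def normalize_query(text: str) -> str:
    """Fold a raw query into one canonical form (hypothetical helper)."""
    # NFKC merges compatibility variants (e.g. full-width letters) and
    # composes base characters with their combining marks.
    composed = unicodedata.normalize("NFKC", text)
    # casefold() is a stronger, language-independent form of lowercasing.
    return composed.casefold().strip()


# "é" as one code point vs. "e" followed by a combining acute accent
assert normalize_query("caf\u00e9") == normalize_query("cafe\u0301")
# Full-width Latin letters collapse to their ASCII counterparts
assert normalize_query("ＣＵＲＲＹ") == "curry"
```

Without a step like this, two visually identical queries can fail to match the same indexed token.
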
  13. Global Search V2 (GS2)

  14. About the Global Search V2 (GS2) project
    • The old search (GS1) was a Rails app with Elasticsearch
    • We renewed the search backend with a whole new tech stack
    • Labels on the slide: 5.x, 7.x, Hako
  15. Before

  16. After

  17. Changes in GS2
    • Python!
    • Kubernetes!
    • Kafka!
    • Elasticsearch 7!
    • ML API!
    • AB testing!
    • Microservices & DevOps!
  18. Why GS2?
    • More product development for search: more search engineering resources needed
    • Hiring: hard to find a Rails + Elasticsearch + relevance engineer
    • Want to use ML and other new technologies
    • Faster iteration cycles, AB testing: GS1 was not good at AB testing
    • Python has big NLP libraries: easier ML research integration, language detection libraries, etc.
  19. GS2 features

  20. Ingestion pipeline
    • Kafka event streaming
    • Faust: Python stream processing
    • Avro schema to share the event structure
    • Near-real-time indexing
    • Kafka: at-least-once delivery
      ◦ makes sure all events are delivered
    • Elasticsearch: document versioning
      ◦ prevents overwriting when running reingestion
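
The interplay of at-least-once delivery and document versioning can be sketched without any external service. The class below is an in-memory stand-in, not GS2 code: `VersionedIndex` and the event tuples are hypothetical. It accepts a write only when the incoming version is strictly newer, which is similar in spirit to Elasticsearch's external versioning; whether GS2 uses that exact mechanism is not stated in the slides.

```python
from dataclasses import dataclass, field


@dataclass
class VersionedIndex:
    """In-memory stand-in for a search index with versioned writes."""
    docs: dict = field(default_factory=dict)  # doc_id -> (version, body)

    def index(self, doc_id: str, version: int, body: dict) -> bool:
        """Store the document only if `version` is newer than the stored one."""
        current = self.docs.get(doc_id)
        if current is not None and version <= current[0]:
            return False  # duplicate or stale event: keep the newer document
        self.docs[doc_id] = (version, body)
        return True


index = VersionedIndex()
events = [
    ("recipe-1", 1, {"title": "Curry"}),
    ("recipe-1", 2, {"title": "Chicken curry"}),
    ("recipe-1", 2, {"title": "Chicken curry"}),  # redelivered (at-least-once)
    ("recipe-1", 1, {"title": "Curry"}),          # stale event from a reingestion run
]
results = [index.index(*event) for event in events]
```

Because rejected writes leave the stored document untouched, replaying the same events any number of times ends in the same index state, which is what makes at-least-once delivery safe here.
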
  21. No more shared DB!

  22. Enrichment pipeline: enriches the recipe document using ML APIs
    • image attractiveness
    • text embeddings
    • recipe categorization
    • …
  23. AB testing
    • Uses the feature flag system
    • Groups requests by GUID into 16 channels
    • Tests by language, country
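
One common way to group requests by GUID into a fixed number of channels is to hash the GUID and take the result modulo the channel count. The sketch below assumes that approach; only the number 16 comes from the slide, while `channel_for` and the choice of SHA-256 are illustrative and may differ from what the feature flag system actually does.

```python
import hashlib

NUM_CHANNELS = 16  # from the slide; everything else here is an assumption


def channel_for(guid: str, num_channels: int = NUM_CHANNELS) -> int:
    """Map a GUID to a stable channel number in [0, num_channels)."""
    digest = hashlib.sha256(guid.encode("utf-8")).digest()
    # Use the first 8 bytes as an unsigned integer, then bucket it.
    return int.from_bytes(digest[:8], "big") % num_channels


# The same GUID always lands in the same channel, so a user keeps
# seeing the same variant for the whole experiment.
guid = "3f2b8c1e-0000-4000-8000-000000000001"  # made-up example value
assert channel_for(guid) == channel_for(guid)
assert 0 <= channel_for(guid) < NUM_CHANNELS
```

A hash from `hashlib` is used rather than Python's built-in `hash()`, because the latter is randomized per process for strings and would not give stable assignments across servers.
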
  24. AB test analysis is done easily: the data platform can handle the experiment channel and show the result
  25. k8s infra: the team owns its resources https://sourcediving.com/search-at-cookpad-building-new-infrastructure-dc58f4eab93f

  26. Search Portal
    • Admin tool
    • Dictionary management
    • Debugging the search result
    • KPI metrics
    • Stemming fixes
    • Stopword list
    • Important changes are moderated by an engineer
  27. SSMs - Search Success Managers
    • Local community manager responsible for search
    • In GS2 it is a more dedicated role
    • Has a deep understanding of how dictionaries work
    • Holds workshops regularly to share the knowledge
    A local person is responsible for the final search adjustment using the dictionary tool. Engineers can't handle 30+ languages and cultures. I can't read these: เนื้อไก चकन ચકન . . .
  28. Debugging search results is easier
    • Added more debugging capabilities
    • SSMs can solve their problems on their own
    • Engineers can notice strange behavior
  29. ML challenges
    • ML enrichment
    • category prediction
    • Q2Q
    • ingredient expansion
    • recipe recommendation
    • ingredient normalization
    • . . .
    Example: enrichments.classification_tagging.tag = "diet", enrichments.classification_tagging.weight > 0.8
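
The two conditions on this slide can be read as a search filter over the enriched recipe document. As a rough sketch, the function below builds an Elasticsearch Query DSL body as a plain Python dict. The field names come from the slide; the helper name `tag_filter` and the assumption that `enrichments.classification_tagging` is mapped as a `nested` field (so that tag and weight are matched on the same entry) are not confirmed by the talk.

```python
def tag_filter(tag: str, min_weight: float) -> dict:
    """Build a query matching recipes tagged `tag` with weight above `min_weight`."""
    prefix = "enrichments.classification_tagging"
    return {
        "nested": {  # assumption: the field is mapped as `nested`
            "path": prefix,
            "query": {
                "bool": {
                    "filter": [
                        {"term": {f"{prefix}.tag": tag}},
                        {"range": {f"{prefix}.weight": {"gt": min_weight}}},
                    ]
                }
            },
        }
    }


query = tag_filter("diet", 0.8)
```

If the field were a plain object mapping instead, the inner `bool` clause alone would be the query, at the cost of tag and weight possibly matching on different entries of the same recipe.
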
  30. ML challenges
    • ML enrichment
    • category prediction
    • Q2Q
    • ingredient expansion
    • recipe recommendation
    • ingredient normalization
    • . . .
    Q2Q (query-to-query) suggestion: suggestions to narrow down the result
  31. How we work has changed a lot too

  32. In GS1 the team became reactive
    • Increasing languages, traffic, product demand
    • Reactive to search inquiries and product requests
    • Had to rely on an external team to make changes
  33. Proactive team with more capabilities
    • More people work on search
    • Dedicated platform team
    • Web app engineers
  34. #ask-search channel
    • Inquiries from CMs are investigated and answered systematically
    • Questions about dictionary management -> SSMs help each other
    • Bugs or strange behavior of the search -> Search Ranger rotation
  35. Responsibility change
    • We own our infrastructure
    • Alerting, SLOs
    • On-call shifts
    • We own the quality of search and its metrics
    • We proactively act to improve search
  36. Philosophy for teams building solutions in search: You build it, you own it, you run it.
  37. Summary
    • Renewal was needed for bigger challenges
    • GS2 was built on a completely new tech stack
    • It has the capabilities to support further endeavours
    • The team has more responsibilities
  38. Check our Global Tech blog, Source Diving: https://sourcediving.com/

  39. (no text on this slide)

  40. Go Global Search 2

  41. THANK YOU FOR LISTENING