Go Global Search 2

Orgil Dj
December 01, 2022

About Cookpad Global's search system.
#CookpadTechConf 2022



  1. $ whoami
    Orgil オリギル
    • From Mongolia
    • Joined 2016 as a new grad 🌸
    • Cookpad Global, search engineer
    • Bouldering, photography 📷, robot anime 🤖
    • Favorite Gundam series: Z Gundam
    https://cookpad.com/recipe/730175
  2. Agenda
    • About Cookpad Global and Global Search
    • New platform: global-search-2 (GS2)
    • How the search team has changed
    • Summary
  3. Cookpad Global
    • We run a recipe service in 30+ languages and 70+ countries
    • 100M+ monthly users
    • Completely separate platform from the JP one
    • We believe that the world would be a better place if there were more creators
  4. 10

  5. Challenges in 30+ languages
    • Unicode normalization
    • Tokenization
    • Stemming
    • Dictionaries
    • Grammar
    • Compound words
    • Continuous languages
    More about internationalization: https://techconf.cookpad.com/2017/rejasupotaro.html
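
To illustrate the Unicode normalization challenge listed above, here is a minimal sketch using Python's standard `unicodedata` module. The helper name `normalize_query` is hypothetical, and the choice of NFKC plus case folding is an assumption for illustration, not necessarily what GS2 does.

```python
import unicodedata


def normalize_query(text: str) -> str:
    """Fold equivalent Unicode spellings of a query into one form."""
    # NFKC composes combining sequences and maps compatibility characters
    # (for example full-width Latin letters) to their standard forms.
    # casefold() then removes case differences.
    return unicodedata.normalize("NFKC", text).casefold()


# "e" followed by a combining acute accent vs. the precomposed "é"
decomposed = "cafe\u0301"
precomposed = "caf\u00e9"

print(decomposed == precomposed)  # False
print(normalize_query(decomposed) == normalize_query(precomposed))  # True
```

Without a step like this, two queries that look identical on screen can fail to match the same indexed terms.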
  6. About the Global Search V2 (GS2) project
    • The old search (GS1) was a Rails app with Elasticsearch
    • We rebuilt the search backend on a whole new tech stack
    (Diagram labels: 5.x, 7.x, Hako)
  7. Why GS2?
    • More product development for search: more search engineering resources needed
    • Hiring: hard to find engineers with Rails + Elasticsearch + relevance experience
    • Want to use ML and other new technologies
    • Faster iteration cycles and A/B testing: GS1 was not good at A/B testing
    • Python has large NLP libraries: easier ML research integration, language detection libraries, etc.
  8. Ingestion pipeline
    • Kafka event streaming
    • Faust: Python stream processing
    • Avro schema to share the event structure
    • Near-real-time indexing
    • Kafka: at-least-once delivery
      ◦ Makes sure all events are delivered
    • Elasticsearch: document versioning
      ◦ Prevents overwriting when running reingestion
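
At-least-once delivery means the same event can arrive twice, and reingestion can replay older events after newer ones. The sketch below shows, under stated assumptions, how a versioned write makes indexing safe in both cases. It is a pure in-memory stand-in that mimics the rule Elasticsearch applies with external versioning (a write is accepted only if its version is higher than the stored one). Whether GS2 uses external versioning specifically is an assumption, and `RecipeEvent` and `VersionedIndex` are hypothetical names.

```python
from dataclasses import dataclass, field
from typing import Dict, Tuple


@dataclass(frozen=True)
class RecipeEvent:
    recipe_id: str
    version: int  # hypothetical: e.g. an update counter or timestamp from the source
    title: str


@dataclass
class VersionedIndex:
    """In-memory stand-in for a search index with externally supplied versions."""

    docs: Dict[str, Tuple[int, str]] = field(default_factory=dict)

    def index(self, event: RecipeEvent) -> bool:
        """Store the event unless an equal or newer version is already indexed."""
        stored = self.docs.get(event.recipe_id)
        if stored is not None and event.version <= stored[0]:
            return False  # duplicate or stale event: keep the stored document
        self.docs[event.recipe_id] = (event.version, event.title)
        return True


idx = VersionedIndex()
results = [
    idx.index(RecipeEvent("r1", 2, "Chicken curry (edited)")),  # live update
    idx.index(RecipeEvent("r1", 1, "Chicken curry")),           # stale reingestion
    idx.index(RecipeEvent("r1", 2, "Chicken curry (edited)")),  # duplicate delivery
]
print(results)          # [True, False, False]
print(idx.docs["r1"])   # (2, 'Chicken curry (edited)')
```

Because stale and duplicate writes are rejected rather than applied, the consumer can simply retry or replay events without tracking which ones were already processed.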
  9. Enrichment pipeline
    • Enriches the recipe document
    • ML APIs
      ◦ image attractiveness
      ◦ text embeddings
      ◦ recipe categorization
      ◦ …
  10. A/B testing
    • Uses the feature flag system
    • Groups requests by GUID into 16 channels
    • Tests can be scoped by language and country
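
Grouping requests by GUID into 16 channels can be sketched as a stable hash of the GUID taken modulo 16. This is an illustrative assumption about how such bucketing might work, not the actual feature flag implementation; `channel_for` and `variant_for` are hypothetical names, and SHA-256 is chosen here only because it is stable across processes (unlike Python's built-in `hash`).

```python
import hashlib
from typing import Set

NUM_CHANNELS = 16


def channel_for(guid: str, num_channels: int = NUM_CHANNELS) -> int:
    """Map a GUID to a stable channel number in [0, num_channels)."""
    digest = hashlib.sha256(guid.encode("utf-8")).digest()
    # Use the first 8 bytes of the digest as an unsigned integer.
    return int.from_bytes(digest[:8], "big") % num_channels


def variant_for(guid: str, treatment_channels: Set[int]) -> str:
    """Return which arm of an experiment a request falls into."""
    return "treatment" if channel_for(guid) in treatment_channels else "control"


guid = "3f2b8c1e-0000-4000-8000-000000000001"
# The same GUID always lands in the same channel, so a user sees a
# consistent variant across requests.
print(channel_for(guid) == channel_for(guid))  # True
print(variant_for(guid, treatment_channels={0, 1}) in {"treatment", "control"})  # True
```

Assigning whole channels to an experiment, rather than individual users, lets several experiments run side by side on disjoint channel sets.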
  11. A/B test analysis done easily
    • The data platform can handle the experiment channel and show the results
  12. Search Portal
    • Admin tool
      ◦ dictionary management
      ◦ debugging search results
      ◦ KPI metrics
      ◦ stemming fixes
      ◦ stopword list
    • Important changes are moderated by engineers
  13. SSMs: Search Success Managers
    • Local community managers responsible for search
    • In GS2 it is a more dedicated role
    • Have a deep understanding of how dictionaries work
    • Hold regular workshops to share knowledge
    • A local person is responsible for the final search adjustments using the dictionary tool
    • Engineers can't handle 30+ languages and cultures
    • I can't read these: เนื้อไก चकन ચકન . . .
  14. Debugging search results is easier
    • Added more debugging capabilities
    • SSMs can solve their problems on their own
    • Engineers can notice strange behavior
  15. ML challenges
    • ML enrichment
    • category prediction
    • Q2Q
    • ingredient expansion
    • recipe recommendation
    • ingredient normalization
    • . . .
    enrichments.classification_tagging.tag = "diet"
    enrichments.classification_tagging.weight > 0.8
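
The two conditions on this slide (tag equals "diet", weight above 0.8) could be expressed as an Elasticsearch query DSL filter. The sketch below builds that filter as a plain Python dict. It assumes `enrichments.classification_tagging` is mapped as a `nested` field with `tag` as a keyword, which the slide does not state; `tag_filter` is a hypothetical helper name.

```python
from typing import Any, Dict


def tag_filter(tag: str, min_weight: float) -> Dict[str, Any]:
    """Build a query matching documents with a confident ML classification tag."""
    path = "enrichments.classification_tagging"  # field name taken from the slide
    return {
        # Assumption: the field is mapped as `nested`, so both conditions
        # must hold on the same tag entry rather than on any two entries.
        "nested": {
            "path": path,
            "query": {
                "bool": {
                    "filter": [
                        {"term": {f"{path}.tag": tag}},
                        {"range": {f"{path}.weight": {"gt": min_weight}}},
                    ]
                }
            },
        }
    }


query = tag_filter("diet", 0.8)
```

If the field were a plain object array instead, a `bool` filter without the `nested` wrapper would match a document where one entry has the tag and a different entry has the high weight, which is usually not the intent.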
  16. ML challenges
    • ML enrichment
    • category prediction
    • Q2Q
    • ingredient expansion
    • recipe recommendation
    • ingredient normalization
    • . . .
    Q2Q (query-to-query) suggestion: suggestions to narrow down the results
  17. In GS1 the team became reactive
    • Increasing languages, traffic, and product demand
    • Reactive to search inquiries and product requests
    • Had to rely on external teams to make changes
  18. Proactive team with more capabilities
    • More people work on search
    • Dedicated platform team
    • Web app engineers
  19. #ask-search channel
    • Inquiries from CMs (community managers) are investigated and answered systematically
    • Questions about dictionary management -> SSMs help each other
    • Bugs or strange search behavior -> Search Ranger rotation
  20. Responsibility change
    • We own our infrastructure
    • Alerting, SLOs
    • On-call shifts
    • We own the quality of search and its metrics
    • We act proactively to improve search
  21. Summary
    • A renewal was needed to take on bigger challenges
    • GS2 was built on a completely new tech stack
    • It has the capabilities to support further endeavours
    • The team has more responsibilities
  22. 39