Evolution of e-commerce search @ shopping24

Held at the first Search Technology Meetup in Hamburg on November 19th.

Torsten Bøgh Köster

November 19, 2014

Transcript

  1. Evolution of e-commerce search @ shopping24. Search Technology Meetup Hamburg. Torsten Bøgh Köster (Shopping24), 19 November 2014
  2. Agenda: Why search? Motivation & introduction; Evolutionary steps taken; Advanced steps; Pitfalls
  3. @tboeghk ‣ CTO, shopping24 internet group ‣ University of Hamburg, class of 2005 ‣ Likes: search, build, delivery, code quality, road bike
  4. None
  5. Open Source Power. Delivered.

  6. search system architecture overview

  7. Fun fact: <1% of visitors actually use the search bar.

  8. Search enables automatic SEA scaling. But what about navigating afterwards?

  9. Agenda: Why search? Motivation & introduction; Evolutionary steps taken; Advanced steps; Pitfalls
  10. Don’t get me started on tokenizing. Move expensive operations (synonyms, stemming) to index time.
  11. German stemming: „Ein_ Geschicht_ voll__ Missverständniss_“ ("a story full of misunderstandings", with the endings stemmed away). Avoid the Porter and Snowball stemmers.
  12. Extend recall using synonyms & subtopics; use the edismax query parser with boost terms for high precision. Consider reranking to penalize documents
  13. 3 approaches to navigating search results

  14. Use faceting to narrow a search result; use adaptive tree structures.
  15. The direct spellchecker in Solr does a great job. Consider the word-break spellchecker. Avoid dictionaries; handle special cases using synonyms (+ custom code).
  16. Use Solr's MoreLikeThis. Supply terms in the MLT request. Works on more than one document as well. Filter on gender (and categories).
  17. Remove terms from the query and retry when hitting zero results. Uses the spellchecker & custom collators.
  18. Recycle the Solr spellchecker infrastructure to retrieve related brands, categories & searches.
  19. Agenda: Why search? Motivation & introduction; Evolutionary steps taken; Advanced steps; Pitfalls
  20. TF/IDF ranking does not work for e-commerce search. Consider the bmax query parser.
  21. First impressions matter: use Solr grouping and expand to „fold“ similar products.
  22. Separate data & ranking information. Retrieve ranking information from an external data store (ExternalFileFieldType, RedisFieldType) via per-document lookup. Use boost functions to mix the retrieved information.
  23. Agenda: Why search? Motivation & introduction; Evolutionary steps taken; Advanced steps; Pitfalls
  24. Visualize results for the target audience. Separate business from technical views.
  25. Custom code in Solr is failure by design. You will inevitably hit garbage collection hell. GC will happen, deal with it.
  26. Ultimate solution: issue replication slots to slaves. Perform a full GC after cache warming.
  27. Find us on github.com

  28. Questions? @tboeghk developer.s24.com torsten.koester@s24.com
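
Note on slide 17 (zero-result fallback): the slide says the retry is built on the Solr spellchecker and custom collators. The sketch below does not reproduce that implementation. As a simplified client-side stand-in, it drops as few terms as possible from the query and retries until some variant returns hits. CATALOG, count_hits and search_with_fallback are hypothetical names used for illustration only; in a real setup count_hits would be replaced by a call to the search backend.

```python
from itertools import combinations
from typing import Callable, Optional, Tuple

# Hypothetical toy catalog standing in for a real product index.
CATALOG = [
    "red nike running shoes",
    "blue nike jacket",
    "red adidas shoes",
]


def count_hits(query: str) -> int:
    """Toy search backend: count documents containing every query term."""
    wanted = set(query.lower().split())
    return sum(1 for doc in CATALOG if wanted <= set(doc.split()))


def search_with_fallback(
    query: str,
    search: Callable[[str], int],
    min_terms: int = 1,
) -> Tuple[Optional[str], int]:
    """Return the first query variant with hits, dropping as few terms as possible.

    The full query is tried first, then every variant with one term removed,
    then two, and so on, down to `min_terms` remaining terms.
    """
    terms = query.split()
    for keep in range(len(terms), min_terms - 1, -1):
        # combinations() preserves the original term order.
        for kept in combinations(terms, keep):
            candidate = " ".join(kept)
            hits = search(candidate)
            if hits > 0:
                return candidate, hits
    # No variant produced any results.
    return None, 0


if __name__ == "__main__":
    print(search_with_fallback("red nike sneakerz", count_hits))
```

The number of variants grows quickly with query length, so a sketch like this only suits short queries; keeping the relaxation inside the search server, as the slide suggests, avoids the extra round trips.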