Evolution of e-commerce search @ shopping24
Torsten Bøgh Köster
November 19, 2014
Held at the first Search Technology Meetup in Hamburg on November 19, 2014.
Transcript
Evolution of e-commerce search @ shopping24 Search Technology Meetup Hamburg
Torsten Bøgh Köster (Shopping24) 19. November 2014
Agenda: Why search? Motivation & introduction; evolutionary steps taken; advanced steps; pitfalls.
@tboeghk
‣ CTO shopping24 internet group
‣ University of Hamburg, class of 2005
‣ Likes: search, build, delivery, code quality, road bike
Open Source Power. Delivered.
search system architecture overview
Fun fact: fewer than 1% of visitors actually use the search bar.
Search enables automatic SEA scaling. But what about navigating afterwards?
Don’t get me started on tokenizing. Move expensive operations (synonyms, stemming) to index time.
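To illustrate the idea of paying the analysis cost once per document instead of once per query, here is a minimal Python sketch. It is not the shopping24 setup: the synonym table, the toy suffix stripper, and all function names are assumptions made for this example. Synonym expansion happens only while indexing; the query path only tokenizes and normalizes.

```python
import re

# Hypothetical synonym table; a real deployment would load this from a file.
SYNONYMS = {"sneaker": ["trainer"], "sofa": ["couch"]}


def naive_stem(token: str) -> str:
    """Toy suffix stripper standing in for a real stemmer (assumption)."""
    for suffix in ("es", "s"):
        if token.endswith(suffix) and len(token) > len(suffix) + 2:
            return token[: -len(suffix)]
    return token


def analyze_for_index(text: str) -> list[str]:
    """Expensive path: tokenize, stem, and expand synonyms once per document."""
    terms = []
    for token in re.findall(r"\w+", text.lower()):
        stem = naive_stem(token)
        terms.append(stem)
        terms.extend(SYNONYMS.get(stem, []))
    return terms


def analyze_for_query(text: str) -> list[str]:
    """Cheap path: tokenize and stem only; synonyms are already in the index."""
    return [naive_stem(t) for t in re.findall(r"\w+", text.lower())]


def build_index(docs: dict[str, str]) -> dict[str, set[str]]:
    """Inverted index mapping each term to the ids of documents containing it."""
    index: dict[str, set[str]] = {}
    for doc_id, text in docs.items():
        for term in analyze_for_index(text):
            index.setdefault(term, set()).add(doc_id)
    return index
```

With this layout a query for "trainers" finds a document that only mentions "sneakers", without any synonym lookup at query time.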
German stemming: „Ein_ Geschicht_ voll__ Missverständniss_“ (roughly "a story full of misunderstandings", with the endings cut off). Refrain from the Porter and Snowball stemmers.
Extend recall using synonyms and subtopics; use the edismax query parser with boost terms for high precision. Consider reranking to penalize documents
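As a rough sketch of what such a request could look like, the Python snippet below assembles Solr request parameters for the edismax parser with a boost query (`bq`) and a rerank query (`rq`). The field names, weights, and the `in_stock` criterion are assumptions for illustration only; the slide does not say which fields or rerank criteria shopping24 uses.

```python
from urllib.parse import urlencode


def build_edismax_params(user_query: str, boost_terms: dict[str, float]) -> dict[str, str]:
    """Assemble Solr request parameters: edismax plus boost query and reranking."""
    # Boost query: documents matching these clauses rank higher, but the
    # clauses do not restrict which documents match at all.
    bq = " ".join(f"{clause}^{weight}" for clause, weight in boost_terms.items())
    return {
        "q": user_query,
        "defType": "edismax",
        "qf": "title^3 brand^2 description",  # hypothetical fields and weights
        "bq": bq,
        # Rerank only the top 200 hits. Lifting the documents that match rqq
        # effectively demotes the others within that window.
        "rq": "{!rerank reRankQuery=$rqq reRankDocs=200 reRankWeight=3}",
        "rqq": "in_stock:true",  # hypothetical rerank criterion
    }


params = build_edismax_params("red shoes", {"brand:acme": 2.0})
query_string = urlencode(params)  # ready to append to a /select URL
```

Recall comes from the broad `q` plus index-time synonyms, while `bq` and `rq` only reorder what was already matched.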
3 approaches to navigating search results
Use faceting to narrow a search result; use adaptive tree structures.
The direct spellchecker in Solr does a great job. Consider word break. Avoid dictionaries; handle special cases using synonyms (+ custom code).
Use Solr's MoreLikeThis. Supply terms in the MLT request. Works on more than one document as well. Filter on gender (and categories).
Remove terms from the query and retry when hitting zero results. Uses the spellchecker & custom collators.
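The talk implements this inside Solr with the spellchecker and custom collators. The Python sketch below is a simpler client-side stand-in for the same idea, not that implementation: it retries with progressively fewer terms until something matches. The `search` callable is a hypothetical hook that takes a query string and returns a list of hits.

```python
from itertools import combinations
from typing import Callable


def relaxed_search(query: str, search: Callable[[str], list[str]]) -> tuple[str, list[str]]:
    """Retry with progressively fewer terms until a query returns hits."""
    terms = query.split()
    # Try the full query first, then every variant with one term dropped,
    # then with two terms dropped, and so on.
    for keep in range(len(terms), 0, -1):
        for subset in combinations(terms, keep):
            candidate = " ".join(subset)
            hits = search(candidate)
            if hits:
                return candidate, hits
    # Nothing matched, not even single terms.
    return query, []
```

The number of candidates grows quickly with query length, so a real system would cap the retries or, as on the slide, let the search engine pick promising term combinations.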
Recycle Solr spellchecker infrastructure to retrieve related brands, categories &
searches.
TF/IDF ranking does not work for e-commerce search. Consider the
bmax query parser.
First impression matters: use Solr grouping and expand to „fold“ similar products.
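Solr does the folding server-side. To show the effect on a result list, here is a small client-side Python sketch under assumed names: `product_group` is a hypothetical field that similar products share, and hits are assumed to arrive sorted by relevance.

```python
def fold_results(hits: list[dict], group_field: str = "product_group") -> list[dict]:
    """Keep the best-ranked hit per group; list the other members as variants."""
    folded: dict[str, dict] = {}
    for hit in hits:  # assumed to be sorted by relevance already
        key = hit[group_field]
        if key not in folded:
            # First (best) hit of this group represents it on the result page.
            folded[key] = {**hit, "variants": []}
        else:
            # Later hits of the same group are tucked away for "expand".
            folded[key]["variants"].append(hit["id"])
    return list(folded.values())
```

The first result page then shows one entry per product group instead of many near-identical variants.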
Separate data & ranking information. Retrieve ranking information from an external data store (ExternalFileFieldType, RedisFieldType) via a per-document lookup. Use boost functions to mix the information retrieved.
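A minimal Python sketch of the separation: ranking values live in a `key=value` file (the layout Solr's external file fields read) and are mixed into the text score per document. The multiplicative mix and the default of 1.0 are assumptions for this example, not the boost function used in the talk.

```python
def parse_external_scores(lines: list[str]) -> dict[str, float]:
    """Parse `doc_id=value` lines into a per-document ranking table."""
    scores: dict[str, float] = {}
    for line in lines:
        line = line.strip()
        if not line or "=" not in line:
            continue  # skip blank or malformed lines
        key, value = line.split("=", 1)
        scores[key] = float(value)
    return scores


def boosted_score(text_score: float, doc_id: str,
                  external: dict[str, float], default: float = 1.0) -> float:
    """Multiply text relevance by the externally maintained ranking value."""
    return text_score * external.get(doc_id, default)
```

Because the ranking table is independent of the index, it can be refreshed (for example from click or revenue data) without reindexing the product documents.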
Visualize results for the target audience. Separate business from technical
views.
Custom code in Solr is failure by design. You will
inevitably hit garbage collection hell. GC will happen, deal with it.
Ultimate solution: issue replication slots to slaves. Perform a full GC after cache warming.
Find us on github.com
Questions? @tboeghk developer.s24.com