Evolution of e-commerce search @ shopping24
Torsten Bøgh Köster
November 19, 2014
Held at the first Search Technology Meetup in Hamburg on November 19th.
Transcript
Evolution of e-commerce search @ shopping24 Search Technology Meetup Hamburg
Torsten Bøgh Köster (Shopping24) 19. November 2014
Agenda Why search? Motivation & introduction Evolutionary steps taken Advanced
steps Pitfalls
@tboeghk
‣ CTO shopping24 internet group
‣ University of Hamburg, class of 2005
‣ Likes: search, build, delivery, code quality, road bike
Open Source Power. Delivered.
search system architecture overview
Fun fact: <1% of visitors actually use the search bar.
Search enables automatic SEA scaling. But what about navigating afterwards?
Agenda Why search? Motivation & introduction Evolutionary steps taken Advanced
steps Pitfalls
Don’t get me started on tokenizing. Move expensive operations (synonyms, stemming) to index time.
German stemming: „Ein_ Geschicht_ voll__ Missverständniss_“ ("Eine Geschichte voller Missverständnisse", a story full of misunderstandings). Refrain from the Porter and Snowball stemmers.
Extend recall using synonyms & subtopics, use edismax query parser
with boost terms for high precision. Consider reranking to penalize documents
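The edismax advice above could translate into request parameters along these lines. This is a minimal sketch, not the configuration used at shopping24: the field names (`title`, `brand`, `description`), the weights, and the helper name `edismax_params` are assumptions for illustration. `defType`, `qf`, and `bq` are standard Solr parameters.

```python
from urllib.parse import urlencode


def edismax_params(user_query, boost_terms):
    """Build Solr request parameters for an edismax query.

    boost_terms maps a term to a weight; each entry becomes an additive
    boost query (bq) that lifts matching documents without filtering.
    Field names and weights here are hypothetical.
    """
    params = [
        ("defType", "edismax"),
        ("q", user_query),
        # Hypothetical fields: weight title matches above brand and description.
        ("qf", "title^3 brand^2 description"),
    ]
    for term, weight in boost_terms.items():
        params.append(("bq", f"title:{term}^{weight}"))
    return params


# Encode the parameters as a URL query string for a /select request.
query_string = urlencode(edismax_params("running shoes", {"sneaker": 2}))
```

Synonyms are left out of the query on purpose, since the earlier advice is to expand them at index time.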
3 approaches to navigating search results
Use faceting to narrow a search result; use adaptive tree structures.
The direct spellchecker in Solr does a great job. Consider word break. Avoid dictionaries; handle special cases using synonyms (+ custom code).
Use Solr's MoreLikeThis. Supply terms in the MLT request. Works on more than one document as well. Filter on gender (and categories).
Remove terms from the query and retry when hitting zero results. Uses spellchecker & custom collators.
Recycle the Solr spellchecker infrastructure to retrieve related brands, categories & searches.
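The zero-results fallback can be illustrated outside of Solr. The sketch below is an assumption about the general idea, not the spellchecker and collator based implementation the slide refers to: it drops terms from the query, fewest removals first, and returns the first reduced query that yields hits. The `search` callable and the function name are hypothetical.

```python
from itertools import combinations


def search_with_term_removal(terms, search):
    """Retry a search with fewer terms until something matches.

    terms  -- list of query terms, in user order
    search -- hypothetical callable taking a query string and returning
              a list of hits (empty list for zero results)

    Tries the full query first, then every subset with one term removed,
    then two removed, and so on. Returns (terms_used, hits).
    """
    for size in range(len(terms), 0, -1):
        # combinations() keeps the original term order inside each subset.
        for subset in combinations(terms, size):
            hits = search(" ".join(subset))
            if hits:
                return list(subset), hits
    return [], []
```

The number of subsets grows quickly with query length, so a real system would likely cap the number of retries.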
Agenda Why search? Motivation & introduction Evolutionary steps taken Advanced
steps Pitfalls
TF/IDF ranking does not work for e-commerce search. Consider the bmax query parser.
First impression matters: use Solr grouping and expand to „fold“ similar products.
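One way to "fold" similar products is Solr's collapse query parser together with the expand component. The sketch below only builds the request parameters; the field name `product_group_id` and the helper name are assumptions, and the talk may equally refer to classic result grouping (`group=true`).

```python
def fold_params(user_query, group_field="product_group_id", rows_per_group=5):
    """Build Solr parameters that collapse results to one document per group
    and return the remaining group members in a separate "expanded" section.

    group_field is a hypothetical field shared by similar products.
    """
    return {
        "q": user_query,
        # Collapse: keep only the top-scoring document of each group.
        "fq": f"{{!collapse field={group_field}}}",
        # Expand: fetch up to rows_per_group further members per group.
        "expand": "true",
        "expand.rows": str(rows_per_group),
    }
```

The front end can then show one representative product per group and reveal the expanded members on demand.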
Separate data & ranking information. Retrieve ranking information from an external data store (ExternalFileFieldType, RedisFieldType). Use boost functions to mix the information retrieved (per-document lookup).
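To illustrate keeping ranking data outside the index, the sketch below writes scores in the `key=value` line format that Solr's external file fields read (a file named `external_<fieldname>` in the index data directory), and builds a multiplicative edismax boost on that field. The field name `popularity`, the helper names, and the boost formula are assumptions for illustration.

```python
from pathlib import Path


def write_external_file(data_dir, field_name, scores):
    """Write per-document ranking values for an external file field.

    scores maps a document key to a float. Each line has the form
    key=value. Keys are sorted here only to make the output deterministic.
    """
    path = Path(data_dir) / f"external_{field_name}"
    lines = [f"{doc_id}={score}" for doc_id, score in sorted(scores.items())]
    path.write_text("\n".join(lines) + "\n", encoding="utf-8")
    return path


def boost_params(user_query, field_name="popularity"):
    """Multiply the text relevance score by 1 + the external value."""
    return {
        "defType": "edismax",
        "q": user_query,
        # boost= is edismax's multiplicative boost function parameter.
        "boost": f"sum(1,field({field_name}))",
    }
```

Because the file lives outside the index, ranking data can be refreshed without reindexing the product documents.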
Agenda Why search? Motivation & introduction Evolutionary steps taken Advanced
steps Pitfalls
Visualize results for the target audience. Separate business from technical views.
Custom code in Solr is failure by design. You will inevitably hit garbage collection hell. GC will happen, deal with it.
Ultimate solution: issue replication slots to slaves. Perform Full GC after cache warming.
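The slide does not say how replication slots are issued. One possible reading is a coordinator that lets only a limited number of slaves replicate (and then warm caches and run a full GC) at a time, so the rest keep serving traffic. The class below is a hypothetical, single-process sketch of that idea, not shopping24's implementation.

```python
class ReplicationSlots:
    """Hand out a limited number of replication slots to slaves.

    A slave should replicate, warm its caches, and trigger a full GC only
    while it holds a slot, then release the slot for the next slave.
    """

    def __init__(self, max_concurrent):
        self.max_concurrent = max_concurrent
        self.holders = set()

    def acquire(self, slave):
        """Return True if the slave holds a slot after this call."""
        if slave in self.holders:
            return True
        if len(self.holders) >= self.max_concurrent:
            return False
        self.holders.add(slave)
        return True

    def release(self, slave):
        """Give the slot back once warming and GC are finished."""
        self.holders.discard(slave)
```

In a real cluster the slot state would have to live in a shared store, which this sketch leaves out.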
Find us on github.com
Questions? @tboeghk developer.s24.com