Evolution of e-commerce search @ shopping24
Torsten Bøgh Köster
November 19, 2014
Held at the first Search Technology Meetup in Hamburg on November 19, 2014.
Transcript
Evolution of e-commerce search @ shopping24 Search Technology Meetup Hamburg
Torsten Bøgh Köster (Shopping24) 19. November 2014
Agenda
‣ Why search? Motivation & introduction
‣ Evolutionary steps taken
‣ Advanced steps
‣ Pitfalls
@tboeghk
‣ CTO, shopping24 internet group
‣ University of Hamburg, class of 2005
‣ Likes: search, build, delivery, code quality, road bike
Open Source Power. Delivered.
search system architecture overview
Fun fact: fewer than 1% of visitors actually use the search bar.
Search enables automatic SEA scaling. But what about navigating afterwards?
Don’t get me started on tokenizing. Move expensive operations (synonyms, stemming) to index time.
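The idea of paying the analysis cost once, at index time, can be sketched in a few lines of Python. This is only an illustration of the principle, not Solr's analysis chain: the synonym table, the whitespace tokenizer and the function names are assumptions made for the example.

```python
# Hypothetical synonym table; in a real setup this would be curated data.
SYNONYMS = {"sneaker": ["trainer"], "shirt": ["tee"]}

def analyze_for_index(text, synonyms):
    """Expensive analysis, run once per document at index time."""
    expanded = []
    for token in text.lower().split():  # naive whitespace tokenizer
        expanded.append(token)
        expanded.extend(synonyms.get(token, []))  # add synonyms to the index
    return expanded

def build_index(docs, synonyms):
    """Build an inverted index: token -> set of document ids."""
    index = {}
    for doc_id, text in docs.items():
        for token in analyze_for_index(text, synonyms):
            index.setdefault(token, set()).add(doc_id)
    return index

def lookup(index, query):
    """Cheap query-time path: lowercase, split, intersect posting sets."""
    postings = [index.get(token, set()) for token in query.lower().split()]
    return set.intersection(*postings) if postings else set()

docs = {1: "Red Sneaker", 2: "Blue Shirt"}
index = build_index(docs, SYNONYMS)
matches = lookup(index, "red trainer")  # finds document 1 via the synonym
```

The query path does no synonym work at all, which is the point of the slide: the index grows a little, but every query gets cheaper.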
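German stemming: „Ein_ Geschicht_ voll__ Missverständniss_“ (“Eine Geschichte voller Missverständnisse”, a story full of misunderstandings, with the stripped endings marked). Avoid the Porter and Snowball stemmers.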
Extend recall using synonyms & subtopics, use edismax query parser
with boost terms for high precision. Consider reranking to penalize documents
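A request along these lines could be assembled as follows. `defType`, `q`, `qf` and `bq` are standard Solr edismax parameters; the field names (`title`, `brand`, `description`), the weights and the helper name are assumptions for illustration, not the configuration used at shopping24.

```python
from urllib.parse import urlencode

def build_edismax_params(user_query, boost_terms):
    """Build Solr request parameters for an edismax query.

    Field names and weights are hypothetical; adapt them to the schema.
    """
    params = [
        ("defType", "edismax"),                 # select the edismax parser
        ("q", user_query),
        ("qf", "title^3 brand^2 description"),  # searched fields with weights
    ]
    # One boost query (bq) per term: matching documents rank higher,
    # while non-matching documents stay in the result set.
    for term, weight in boost_terms.items():
        params.append(("bq", f'title:"{term}"^{weight}'))
    return params

params = build_edismax_params("red shoes", {"sneaker": 2})
query_string = urlencode(params)  # ready to append to a /select URL
```

Boost queries only influence ordering, so recall gained through synonyms is preserved while the boosted terms pull the most precise matches to the top.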
3 approaches to navigating search results
Use faceting to narrow a search result; use adaptive tree structures.
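What faceting computes can be shown on an in-memory result list. This is a conceptual sketch in plain Python, not how Solr implements facets; the product fields and function names are invented for the example, and the adaptive tree structures are not covered.

```python
from collections import Counter

def facet_counts(products, field):
    """Count how many results carry each value of the given field."""
    return Counter(p[field] for p in products if field in p)

def narrow(products, field, value):
    """Apply a facet selection: keep only results with field == value."""
    return [p for p in products if p.get(field) == value]

# Hypothetical search result
results = [
    {"brand": "Nike", "color": "red"},
    {"brand": "Nike", "color": "blue"},
    {"brand": "Adidas", "color": "red"},
]

brand_facet = facet_counts(results, "brand")   # offered to the user
nike_only = narrow(results, "brand", "Nike")   # user clicks "Nike"
color_facet = facet_counts(nike_only, "color") # facets of the narrowed result
```

After each selection the remaining facets are recomputed on the narrowed result, so the user only ever sees choices that still lead to products.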
The direct spellchecker in Solr does a great job. Consider the word-break spellchecker. Avoid dictionaries; handle special cases using synonyms (+ custom code).
Use Solr's More Like This. Supply terms in the MLT request. Works on more than one document as well. Filter on gender (and categories).
Remove terms from the query and retry when hitting zero results. Uses spellchecker & custom collators.
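The retry loop can be sketched as follows. The slide's version relies on the Solr spellchecker and custom collators to decide which terms to drop; that logic is not shown in the deck, so this sketch swaps in a naive strategy (drop the last term) and a hypothetical `search` callable.

```python
def search_with_term_removal(terms, search):
    """Retry a search with fewer terms until it returns hits.

    `search` is a hypothetical callable: list of terms -> list of hits.
    Naive strategy: drop the last term on every zero-result round.
    Returns the terms that finally matched and the hits found.
    """
    remaining = list(terms)
    while remaining:
        hits = search(remaining)
        if hits:
            return remaining, hits
        remaining.pop()  # zero results: remove one term and retry
    return [], []

# Toy backend: a document matches if it contains every term.
DOCS = ["red nike shoes", "blue adidas shirt"]

def toy_search(terms):
    return [doc for doc in DOCS if all(term in doc for term in terms)]

used_terms, hits = search_with_term_removal(["red", "shoes", "xyz"], toy_search)
```

Returning the terms that were actually used makes it possible to tell the user which part of the query was ignored.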
Recycle the Solr spellchecker infrastructure to retrieve related brands, categories & searches.
TF/IDF ranking does not work for e-commerce search. Consider the
bmax query parser.
First impressions matter: use Solr grouping and expand to “fold” similar products.
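The effect of folding can be illustrated on an already ranked list. This is a conceptual sketch, not Solr's grouping or expand implementation; the `group_id` field and the sample products are assumptions for the example.

```python
def fold_results(ranked_docs, group_field):
    """Fold a ranked result list so each group appears only once.

    The best-ranked document of a group becomes its head; later
    members are kept aside and can be expanded on demand.
    """
    folded = {}  # dicts keep insertion order, so rank order is preserved
    for doc in ranked_docs:
        key = doc[group_field]
        if key not in folded:
            folded[key] = {"head": doc, "expanded": []}
        else:
            folded[key]["expanded"].append(doc)
    return list(folded.values())

# Hypothetical ranked result: the same shoe in three colors, then a shirt.
ranked = [
    {"id": 1, "group_id": "shoe-a", "color": "red"},
    {"id": 2, "group_id": "shoe-a", "color": "blue"},
    {"id": 3, "group_id": "shirt-b", "color": "white"},
    {"id": 4, "group_id": "shoe-a", "color": "black"},
]

first_page = fold_results(ranked, "group_id")
```

Instead of three nearly identical shoes, the first page now shows one shoe and one shirt, with the variants one click away.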
Separate data & ranking information. Retrieve ranking information from an external data store (ExternalFileFieldType, RedisFieldType) with a per-document lookup. Use boost functions to mix in the information retrieved.
Visualize results for the target audience. Separate business from technical
views.
Custom code in Solr is failure by design. You will
inevitably hit garbage collection hell. GC will happen, deal with it.
Ultimate solution: issue replication slots to slaves. Perform a full GC after cache warming.
Find us on github.com
Questions? @tboeghk developer.s24.com