Upgrade to Pro
— share decks privately, control downloads, hide ads and more …
Speaker Deck
Features
Speaker Deck
PRO
Sign in
Sign up for free
Search
Search
Evolution of e-commerce search @ shopping24
Search
Torsten Bøgh Köster
November 19, 2014
Technology
0
1.2k
Evolution of e-commerce search @ shopping24
Held at the first Search Technology Meetup in Hamburg on November, 19th.
Torsten Bøgh Köster
November 19, 2014
Tweet
Share
More Decks by Torsten Bøgh Köster
See All by Torsten Bøgh Köster
Taking an abandoned Solr search from zero to GenAI hero
tboeghk
0
14
Oder mache ich es lieber selbst? Wie sich Kosten und Geopolitik auf Cloud-Betrieb auswirken
tboeghk
0
34
🔪 How we cut our AWS costs in half
tboeghk
0
250
Shared Nothing Logging Infrastructure
tboeghk
0
110
Beyond Cloud: A road trip into AWS and back to bare metal
tboeghk
1
100
Shared Nothing Logging Infrastructure
tboeghk
0
1.3k
Kubernetes the ❤️ way
tboeghk
0
1k
Beyond Cloud: A road trip into AWS and back to bare metal
tboeghk
0
100
Open-Source-Logging und -Monitoring (W-JAX 2017)
tboeghk
0
97
Other Decks in Technology
See All in Technology
マーケットプレイス版Oracle WebCenter Content For OCI
oracle4engineer
PRO
3
990
Delta airlines Customer®️ USA Contact Numbers: Complete 2025 Support Guide
deltahelp
0
1.1k
インフラ寄りSREの生存戦略
sansantech
PRO
9
3.4k
How Do I Contact HP Printer Support? [Full 2025 Guide for U.S. Businesses]
harrry1211
0
130
american airlines®️ USA Contact Numbers: Complete 2025 Support Guide
supportflight
1
120
Copilot coding agentにベットしたいCTOが開発組織で取り組んだこと / GitHub Copilot coding agent in Team
tnir
0
150
話題の MCP と巡る OCI RAG ソリューションの旅 - Select AI with RAG と Generative AI Agents ディープダイブ
oracle4engineer
PRO
5
110
オフィスビルを監視しよう:フィジカル×デジタルにまたがるSLI/SLO設計と運用の難しさ / Monitoring Office Buildings: The Challenge of Physical-Digital SLI/SLO Design & Operation
bitkey
1
350
AWS CDKの仕組み / how-aws-cdk-works
gotok365
10
890
QuickSight SPICE の効果的な運用戦略~S3 + Athena 構成での実践ノウハウ~/quicksight-spice-s3-athena-best-practices
emiki
0
260
マルチプロダクト環境におけるSREの役割 / SRE NEXT 2025 lunch session
sugamasao
1
400
United™️ Airlines®️ Customer®️ USA Contact Numbers: Complete 2025 Support Guide
flyunitedguide
0
780
Featured
See All Featured
Code Reviewing Like a Champion
maltzj
524
40k
XXLCSS - How to scale CSS and keep your sanity
sugarenia
248
1.3M
Refactoring Trust on Your Teams (GOTO; Chicago 2020)
rmw
34
3.1k
Raft: Consensus for Rubyists
vanstee
140
7k
The World Runs on Bad Software
bkeepers
PRO
69
11k
It's Worth the Effort
3n
185
28k
How To Stay Up To Date on Web Technology
chriscoyier
790
250k
Git: the NoSQL Database
bkeepers
PRO
430
65k
BBQ
matthewcrist
89
9.7k
Build The Right Thing And Hit Your Dates
maggiecrowley
37
2.8k
Optimizing for Happiness
mojombo
379
70k
[Rails World 2023 - Day 1 Closing Keynote] - The Magic of Rails
eileencodes
35
2.4k
Transcript
Evolution of e-commerce search @ shopping24 Search Technology Meetup Hamburg
Torsten Bøgh Köster (Shopping24) 19. November 2014
Agenda Why search? Motivation & introduction Evolutionary steps taken Advanced
steps Pitfalls
@tboeghk ‣CTO shopping24 internet group ‣University of Hamburg, class of
2005 ‣Likes: search, build, delivery, code quality, road bike
None
Open Source Power. Delivered.
search system architecture overview
Fun fact: <1% visitors actually use the search bar.
Search enables automatic SEA scaling. But what about navigating afterwards?
Agenda Why search? Motivation & introduction Evolutionary steps taken Advanced
steps Pitfalls
Don’t get me started on tokenizing. Move expensive operations (synonyms,
stemming) to index time
German stemming: „Ein_ Geschicht_ voll__ Missverständniss_“: Refrain from Porter and
Snowball stemmer.
Extend recall using synonyms & subtopics, use edismax query parser
with boost terms for high precision. Consider reranking to penalize documents
3 approaches to navigating search results
use facetting to narrow a search result, use adaptive tree
structures
the direct spellchecker in Solr does a great job. Consider
word break. Avoid dictionaries, handle special cases using synonyms (+ custom code).
Use Solrs more like this. Supply terms in mlt request.
Works on >1 documents as well. Filter on gender (and categories).
remove terms from query and retry when hitting zero results.
Uses spellchecker & custom collators
Recycle Solr spellchecker infrastructure to retrieve related brands, categories &
searches.
Agenda Why search? Motivation & introduction Evolutionary steps taken Advanced
steps Pitfalls
TF/IDF ranking does not work for e-commerce search. Consider the
bmax query parser.
first impression matters: use solr grouping and expand to „fold“
similar products.
Separate data & ranking information. Retrieve ranking information from an
external data store (ExternalFileFieldType, RedisFieldType). Use boost functions to mix information retrieved. per document lookup
Agenda Why search? Motivation & introduction Evolutionary steps taken Advanced
steps Pitfalls
Visualize results for the target audience. Separate business from technical
views.
Custom code in Solr is failure by design. You will
inevitably hit garbage collection hell. GC will happen, deal with it.
Ultimate solution: issue replication slots to slaves. Perform Full GC
after cache warming.
Find us on github.com
Questions? @tboeghk developer.s24.com
[email protected]