Upgrade to Pro
— share decks privately, control downloads, hide ads and more …
Speaker Deck
Features
Speaker Deck
PRO
Sign in
Sign up for free
Search
Search
Evolution of e-commerce search @ shopping24
Search
Torsten Bøgh Köster
November 19, 2014
Technology
0
1.2k
Evolution of e-commerce search @ shopping24
Held at the first Search Technology Meetup in Hamburg on November, 19th.
Torsten Bøgh Köster
November 19, 2014
Tweet
Share
More Decks by Torsten Bøgh Köster
See All by Torsten Bøgh Köster
LLMs im Griff: Observability, Tracing und Security
tboeghk
0
12
Oder mache ich es lieber selbst? Wie sich Kosten und Geopolitik auf Cloud-Betrieb auswirken
tboeghk
0
11
Taking an abandoned Solr search from zero to GenAI hero
tboeghk
0
37
Oder mache ich es lieber selbst? Wie sich Kosten und Geopolitik auf Cloud-Betrieb auswirken
tboeghk
0
42
🔪 How we cut our AWS costs in half
tboeghk
0
330
Shared Nothing Logging Infrastructure
tboeghk
0
120
Beyond Cloud: A road trip into AWS and back to bare metal
tboeghk
1
110
Shared Nothing Logging Infrastructure
tboeghk
0
1.4k
Kubernetes the ❤️ way
tboeghk
0
1.1k
Other Decks in Technology
See All in Technology
生成AIを活用した音声文字起こしシステムの2つの構築パターンについて
miu_crescent
PRO
3
220
ランサムウェア対策としてのpnpm導入のススメ
ishikawa_satoru
0
220
1,000 にも届く AWS Organizations 組織のポリシー運用をちゃんとしたい、という話
kazzpapa3
0
110
10Xにおける品質保証活動の全体像と改善 #no_more_wait_for_test
nihonbuson
PRO
2
330
Amazon Bedrock Knowledge Basesチャンキング解説!
aoinoguchi
0
160
FinTech SREのAWSサービス活用/Leveraging AWS Services in FinTech SRE
maaaato
0
130
AI駆動開発を事業のコアに置く
tasukuonizawa
1
340
CDKで始めるTypeScript開発のススメ
tsukuboshi
1
520
仕様書駆動AI開発の実践: Issue→Skill→PRテンプレで 再現性を作る
knishioka
2
680
コミュニティが変えるキャリアの地平線:コロナ禍新卒入社のエンジニアがAWSコミュニティで見つけた成長の羅針盤
kentosuzuki
0
130
Cloud Runでコロプラが挑む 生成AI×ゲーム『神魔狩りのツクヨミ』の裏側
colopl
0
120
Amazon S3 Vectorsを使って資格勉強用AIエージェントを構築してみた
usanchuu
4
450
Featured
See All Featured
Git: the NoSQL Database
bkeepers
PRO
432
66k
Game over? The fight for quality and originality in the time of robots
wayneb77
1
120
The Illustrated Children's Guide to Kubernetes
chrisshort
51
51k
New Earth Scene 8
popppiees
1
1.5k
Self-Hosted WebAssembly Runtime for Runtime-Neutral Checkpoint/Restore in Edge–Cloud Continuum
chikuwait
0
330
The Pragmatic Product Professional
lauravandoore
37
7.1k
The #1 spot is gone: here's how to win anyway
tamaranovitovic
2
950
個人開発の失敗を避けるイケてる考え方 / tips for indie hackers
panda_program
122
21k
Cheating the UX When There Is Nothing More to Optimize - PixelPioneers
stephaniewalter
287
14k
AI in Enterprises - Java and Open Source to the Rescue
ivargrimstad
0
1.1k
Reflections from 52 weeks, 52 projects
jeffersonlam
356
21k
Put a Button on it: Removing Barriers to Going Fast.
kastner
60
4.2k
Transcript
Evolution of e-commerce search @ shopping24 Search Technology Meetup Hamburg
Torsten Bøgh Köster (Shopping24) 19. November 2014
Agenda Why search? Motivation & introduction Evolutionary steps taken Advanced
steps Pitfalls
@tboeghk ‣CTO shopping24 internet group ‣University of Hamburg, class of
2005 ‣Likes: search, build, delivery, code quality, road bike
None
Open Source Power. Delivered.
search system architecture overview
Fun fact: <1% visitors actually use the search bar.
Search enables automatic SEA scaling. But what about navigating afterwards?
Agenda Why search? Motivation & introduction Evolutionary steps taken Advanced
steps Pitfalls
Don’t get me started on tokenizing. Move expensive operations (synonyms,
stemming) to index time
German stemming: „Ein_ Geschicht_ voll__ Missverständniss_“: Refrain from Porter and
Snowball stemmer.
Extend recall using synonyms & subtopics, use edismax query parser
with boost terms for high precision. Consider reranking to penalize documents
3 approaches to navigating search results
use facetting to narrow a search result, use adaptive tree
structures
the direct spellchecker in Solr does a great job. Consider
word break. Avoid dictionaries, handle special cases using synonyms (+ custom code).
Use Solrs more like this. Supply terms in mlt request.
Works on >1 documents as well. Filter on gender (and categories).
remove terms from query and retry when hitting zero results.
Uses spellchecker & custom collators
Recycle Solr spellchecker infrastructure to retrieve related brands, categories &
searches.
Agenda Why search? Motivation & introduction Evolutionary steps taken Advanced
steps Pitfalls
TF/IDF ranking does not work for e-commerce search. Consider the
bmax query parser.
first impression matters: use solr grouping and expand to „fold“
similar products.
Separate data & ranking information. Retrieve ranking information from an
external data store (ExternalFileFieldType, RedisFieldType). Use boost functions to mix information retrieved. per document lookup
Agenda Why search? Motivation & introduction Evolutionary steps taken Advanced
steps Pitfalls
Visualize results for the target audience. Separate business from technical
views.
Custom code in Solr is failure by design. You will
inevitably hit garbage collection hell. GC will happen, deal with it.
Ultimate solution: issue replication slots to slaves. Perform Full GC
after cache warming.
Find us on github.com
Questions? @tboeghk developer.s24.com
[email protected]