Refactoring a Solr based api application
Torsten Bøgh Köster
April 13, 2012
Held at Apache Lucene Eurocon 2011 in Barcelona
Transcript
Architectural lessons learned from refactoring a Solr based API application.
Torsten Bøgh Köster (Shopping24) Apache Lucene Eurocon, 19.10.2011
Contents
• Shopping24 and its API
• Technical scaling solutions: sharding, caching, Solr cores, "elastic" infrastructure
• Business requirements as key factor
@tboeghk
• Software and systems architect
• 2 years experience with Solr
• 3 years experience with Lucene
• Team of 7 Java developers currently at Shopping24
shopping24 internet group
1 portal became n portals
30 partner shops became 700
500k to 7m documents
Index fact time
• 16 GB of data
• Single-core layout
• Up to 17 s response time
• Machine size limited
• Stalled at Solr version 1.4
• API designed for small tools
scaling goal: 15-50m documents
Ask the nerds
• "Shard!" That'll be fun!
• "Use spare compute cores at Amazon?" Breathe load into the cloud.
• "Reduce that index size."
• "Get rid of those long running queries!"
data sharding ...
... is highly effective.
[Chart: response time (125 ms to 500 ms) against concurrent requests (1 to 20), one series each for 1, 2, 3, 4, 6 and 8 shards]
Sharding: size matters. The bigger your index gets, the more complex your queries are, and the more concurrent requests you serve, the more sharding you need.
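As a rough illustration of what sharding looks like from the client side: Solr's distributed search takes a `shards` request parameter listing the shard addresses, and the node that receives the request merges the partial results. The sketch below only builds such a request URL. The host names and the helper name are hypothetical, not taken from the talk.

```python
from urllib.parse import urlencode


def distributed_query_url(coordinator, shards, query, rows=10):
    """Build a Solr select URL that fans the query out to every shard.

    `coordinator` and `shards` are host:port/path strings without a scheme,
    which is the form Solr expects inside the `shards` parameter.
    """
    params = {
        "q": query,
        "rows": rows,
        # Comma-separated list of shard addresses (hypothetical hosts).
        "shards": ",".join(shards),
    }
    return f"http://{coordinator}/select?{urlencode(params)}"


url = distributed_query_url(
    "solr1:8983/solr",
    ["solr1:8983/solr", "solr2:8983/solr"],
    "jeans",
)
print(url)
```

Adding a shard then means adding one more address to the list, while each shard holds a smaller slice of the index.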
but wait ...
Why do we have such a big index?
7m documents vs. 2m active products
Fashion product lifecycle meets SEO. (Image: Bastografie / photocase.com)
Separation of duties! Remove unsearchable data from your index.
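One way to picture this separation is a filter in front of the indexer: only products that can actually be searched go to Solr, and everything else (for example expired products kept for SEO landing pages) goes to some other store. This is a minimal sketch. The `active` field and the function name are assumptions made for illustration.

```python
def split_for_indexing(products):
    """Separate searchable products from those kept only for other purposes.

    Each product is a dict with a boolean `active` flag (hypothetical field).
    """
    searchable = [p for p in products if p["active"]]
    archived = [p for p in products if not p["active"]]
    return searchable, archived


products = [
    {"id": "1", "name": "blue jeans", "active": True},
    {"id": "2", "name": "last season's coat", "active": False},
    {"id": "3", "name": "red sneakers", "active": True},
]

to_index, to_archive = split_for_indexing(products)
print(len(to_index), "documents for the search index,", len(to_archive), "kept elsewhere")
```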
Why do we have complex queries?
A Solr index designed for 1 portal
Grown into a multi-portal index
Let “sharding“ follow your data ...
... and build separate cores for every client.
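With one core per client, routing becomes a matter of picking the right core path instead of adding client filters to every query. The sketch below assumes a hypothetical naming scheme `portal_<client>` and a hypothetical base URL. Only the `/<core>/select` path layout is standard Solr multi-core behaviour.

```python
from urllib.parse import quote, urlencode


def client_select_url(solr_base, client, query):
    """Return the select URL of the core that belongs to one client."""
    core = f"portal_{client}"  # hypothetical naming scheme
    return f"{solr_base}/{quote(core, safe='')}/select?{urlencode({'q': query})}"


print(client_select_url("http://localhost:8983/solr", "shoes", "red sneakers"))
```

Because each core only contains that client's documents, the query itself can stay simple.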
Duplicate data as long as access is fast. (Image: andybahn / photocase.com)
Streamline your index provisioning process.
A thousand splendid cores at your fingertips.
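Provisioning many cores is easier to automate when core creation is scripted. Solr's CoreAdmin handler accepts `action=CREATE` with a `name` and an `instanceDir`. The sketch below only generates the request URLs and sends nothing. The base URL, the client list, and the choice of using the core name as the instance directory are assumptions.

```python
from urllib.parse import urlencode


def create_core_url(solr_base, core_name):
    """Build a CoreAdmin CREATE request URL for one core."""
    params = {
        "action": "CREATE",
        "name": core_name,
        # Assumption: each core lives in a directory named after the core.
        "instanceDir": core_name,
    }
    return f"{solr_base}/admin/cores?{urlencode(params)}"


def provisioning_urls(solr_base, clients):
    """One CREATE URL per client, using a hypothetical portal_<client> scheme."""
    return [create_core_url(solr_base, f"portal_{client}") for client in clients]


for url in provisioning_urls("http://localhost:8983/solr", ["fashion", "shoes"]):
    print(url)
```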
Throwing hardware at problems. Automated.
evil traps: latency, $$
Mirror your complete system – solve load balancer problems. (Image: froodmat / photocase.com)
I said faster!
use a cache layer like Varnish.
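Varnish itself is an HTTP reverse proxy that is configured separately, so the following is not Varnish configuration. It is a small in-process sketch of the underlying idea, a time-limited cache in front of an expensive search call. The class name, the `fetch` callback and the TTL value are all hypothetical.

```python
import time


class TtlCache:
    """Serve repeated requests from memory until the entry is too old."""

    def __init__(self, fetch, ttl_seconds, clock=time.monotonic):
        self._fetch = fetch          # expensive backend call, e.g. a Solr query
        self._ttl = ttl_seconds
        self._clock = clock          # injectable so the behaviour can be tested
        self._entries = {}           # key -> (stored_at, value)

    def get(self, key):
        now = self._clock()
        entry = self._entries.get(key)
        if entry is not None and now - entry[0] < self._ttl:
            return entry[1]          # cache hit: backend is not touched
        value = self._fetch(key)     # cache miss or expired entry
        self._entries[key] = (now, value)
        return value


backend_calls = []


def slow_search(query):
    backend_calls.append(query)
    return f"results for {query}"


cache = TtlCache(slow_search, ttl_seconds=60)
cache.get("jeans")
cache.get("jeans")
print("backend was called", len(backend_calls), "time(s)")
```

The shorter the TTL, the fresher the results and the more load reaches the search backend. That trade-off depends on how quickly the product data changes.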
What about those complex queries? Why do we have them?
And how do we get rid of them?
Lost in encapsulation: Solr API exposed to the world.
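A common way to stop exposing the raw Solr API is a thin facade: clients send a few business-level parameters, and only the facade knows how they map to Solr syntax. The sketch below assumes hypothetical `category` and `price` fields and a hypothetical row limit. It does not reflect the actual Shopping24 API.

```python
MAX_ROWS = 100  # hypothetical upper bound to keep single requests cheap


def to_solr_params(search_term, category=None, max_price=None, page=0, page_size=20):
    """Translate a business-level search request into Solr request parameters."""
    rows = min(page_size, MAX_ROWS)
    params = {"q": search_term, "start": page * rows, "rows": rows}
    filters = []
    if category is not None:
        # Escape backslashes and quotes so the value stays inside the phrase.
        escaped = category.replace("\\", "\\\\").replace('"', '\\"')
        filters.append(f'category:"{escaped}"')
    if max_price is not None:
        filters.append(f"price:[* TO {max_price}]")
    if filters:
        params["fq"] = filters
    return params


print(to_solr_params("jeans", category="men", max_price=50))
```

Because callers can no longer send arbitrary query syntax, the set of queries the index has to serve becomes small and predictable.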
What‘s the key factor?
look at your business requirements
decrease complexity
Questions? Comments? Ideas?
Twitter: @tboeghk
GitHub: @tboeghk
Email: [email protected]
Web: http://www.s24.com
Images: sxc.hu (unless noted otherwise)