Upgrade to Pro
— share decks privately, control downloads, hide ads and more …
Speaker Deck
Features
Speaker Deck
PRO
Sign in
Sign up for free
Search
Search
Refactoring a Solr based api application
Search
Sponsored
·
SiteGround - Reliable hosting with speed, security, and support you can count on.
→
Torsten Bøgh Köster
April 13, 2012
Programming
120
3
Share
Embed
Copy iframe code
Copy JS code
Copy link
Start on current slide
Refactoring a Solr based api application
Held on Apache Lucene Eurocon 2011 in Barcelona
Torsten Bøgh Köster
April 13, 2012
More Decks by Torsten Bøgh Köster
See All by Torsten Bøgh Köster
LLMs im Griff: Observability, Tracing und Security
tboeghk
0
31
LLMs im Griff: Observability, Tracing und Security
tboeghk
0
46
Oder mache ich es lieber selbst? Wie sich Kosten und Geopolitik auf Cloud-Betrieb auswirken
tboeghk
0
46
Taking an abandoned Solr search from zero to GenAI hero
tboeghk
0
52
Oder mache ich es lieber selbst? Wie sich Kosten und Geopolitik auf Cloud-Betrieb auswirken
tboeghk
0
55
🔪 How we cut our AWS costs in half
tboeghk
0
390
Shared Nothing Logging Infrastructure
tboeghk
0
130
Beyond Cloud: A road trip into AWS and back to bare metal
tboeghk
1
110
Shared Nothing Logging Infrastructure
tboeghk
0
1.4k
Other Decks in Programming
See All in Programming
Modding RubyKaigi for Myself
yui_knk
0
890
Observability in Practice:Grafana 與 Edge Device SRE 的那些事
blueswen
0
120
今さら聞けないCancellationToken
htkym
0
220
Composerを使ったサプライチェーン攻撃の様子を眺めてみる #phpstudy
o0h
PRO
2
220
TypeScript+Orvalで実現する型安全かつ堅牢でスケーラブルなマルチチャネル通知基盤 / TSKaigi Night talks ~after conference~
d0riven
0
300
ローカルLLMを使ってB2Bサービスを作っていての学び
yaotti
0
150
RTSPクライアントを自作してみた話
simotin13
0
510
コンテキストの使い捨てをやめる — ビジネスルール駆動開発と miko —
ioki
0
140
JJUG CCC 2026 Spring: JSpecify で実現する Kotlin フレンドリーな Java API 設計
ternbusty
1
140
TAKTでAI駆動開発の品質を設計する
j5ik2o
6
1k
キャリア迷子上等 ─ "ない道"は自分で作ればいい
16bitidol
3
1.7k
「エンジニアインターン、どうやって取った?」準備のリアルを語るLT会 Progate BAR
akiomatic
0
120
Featured
See All Featured
A Soul's Torment
seathinner
6
2.9k
Discover your Explorer Soul
emna__ayadi
2
1.1k
We Have a Design System, Now What?
morganepeng
55
8.2k
Leveraging Curiosity to Care for An Aging Population
cassininazir
1
260
The Illustrated Children's Guide to Kubernetes
chrisshort
51
52k
Claude Code どこまでも/ Claude Code Everywhere
nwiizo
65
56k
16th Malabo Montpellier Forum Presentation
akademiya2063
PRO
0
140
Noah Learner - AI + Me: how we built a GSC Bulk Export data pipeline
techseoconnect
PRO
0
190
Redefining SEO in the New Era of Traffic Generation
szymonslowik
1
320
30 Presentation Tips
portentint
PRO
1
320
[SF Ruby Conf 2025] Rails X
palkan
2
1.1k
Automating Front-end Workflow
addyosmani
1370
210k
Transcript
Architectural lessons learned from refactoring a Solr based API application.
Torsten Bøgh Köster (Shopping24) Apache Lucene Eurocon, 19.10.2011
Contents Shopping24 and it‘s API Technical scaling solutions Sharding Caching
Solr Cores „Elastic“ infrastructure business requirements as key factor
@tboeghk Software- and systems- architect 2 years experience with Solr
3 years experience with Lucene Team of 7 Java developers currently at Shopping24
shopping24 internet group
1 portal became n portals
30 partner shops became 700
500k to 7m documents
index fact time •16 Gig Data •Single-Core-Layout •Up to 17s
response time •Machine size limited •Stalled at solr version 1.4 •API designed for small tools
scaling goal: 15-50m documents
ask the nerds „Shard!“ That‘ll be fun! „Use spare compute
cores at Amazon?“ breathe load into the cloud „Reduce that index size“ „Get rid of those long running queries!“
data sharding ...
... is highly effective. 125ms 250ms 375ms 500ms 1 4
8 12 16 20 1shard 2shard 3shard 4shard 6shard 8shard concurrent requests
Sharding: size matters the bigger your index gets, the more
complex your queries are, the more concurrent requests, the more sharding you need
but wait ...
Why do we have such a big index?
7m documents vs. 2m active poducts
fashion product lifecycle meets SEO Bastografie / photocase.com
Separation of duties! Remove unsearchable data from your index.
Why do we have complex queries?
A Solr index designed for 1 portal
Grown into a multi-portal index
Let “sharding“ follow your data ...
... and build separate cores for every client.
Duplicate data as long as access is fast. andybahn /
photocase.com
Streamline your index provisioning process.
A thousand splendid cores at your fingertips.
Throwing hardware at problems. Automated.
evil traps: latency, $$
mirror your complete system – solve load balancer problems froodmat
/ photocase.com
I said faster!
use a cache layer like Varnish.
What about those complex queries? Why do we have them?
And how do we get rid of them?
Lost in encapsulation: Solr API exposed to world.
What‘s the key factor?
look at your business requirements
decrease complexity
Questions? Comments? Ideas? Twitter: @tboeghk Github: @tboeghk Email:
[email protected]
Web:
http://www.s24.com Images: sxc.hu (unless noted otherwise)