Upgrade to Pro
— share decks privately, control downloads, hide ads and more …
Speaker Deck
Features
Speaker Deck
PRO
Sign in
Sign up for free
Search
Search
Refactoring a Solr based api application
Search
Torsten Bøgh Köster
April 13, 2012
Programming
110
3
Share
Refactoring a Solr based api application
Held on Apache Lucene Eurocon 2011 in Barcelona
Torsten Bøgh Köster
April 13, 2012
More Decks by Torsten Bøgh Köster
See All by Torsten Bøgh Köster
LLMs im Griff: Observability, Tracing und Security
tboeghk
0
21
Oder mache ich es lieber selbst? Wie sich Kosten und Geopolitik auf Cloud-Betrieb auswirken
tboeghk
0
13
Taking an abandoned Solr search from zero to GenAI hero
tboeghk
0
41
Oder mache ich es lieber selbst? Wie sich Kosten und Geopolitik auf Cloud-Betrieb auswirken
tboeghk
0
44
🔪 How we cut our AWS costs in half
tboeghk
0
360
Shared Nothing Logging Infrastructure
tboeghk
0
130
Beyond Cloud: A road trip into AWS and back to bare metal
tboeghk
1
110
Shared Nothing Logging Infrastructure
tboeghk
0
1.4k
Kubernetes the ❤️ way
tboeghk
0
1.1k
Other Decks in Programming
See All in Programming
年間50登壇、単著出版、雑誌寄稿、Podcast出演、YouTube、CM、カンファレンス主催……全部やってみたので面白さ等を比較してみよう / I’ve tried them all, so let’s compare how interesting they are.
nrslib
4
630
RailsのValidatesをSwift Macrosで再現してみた
hokuron
0
150
飯MCP
yusukebe
0
450
Reactive ❤️ Loom: A Forbidden Love Story
franz1981
2
210
Claude Code Skill入門
mayahoney
0
460
夢の無限スパゲッティ製造機 -実装篇- #phpstudy
o0h
PRO
0
180
へんな働き方
yusukebe
6
2.9k
Linux Kernelの1文字のミスで 権限昇格ができた話
rqda
0
2.2k
Codex CLIのSubagentsによる並列API実装 / Parallel API Implementation with Codex CLI Subagents
takatty
2
750
Geminiをパートナーに神社DXシステムを個人開発した話(いなめぐDX 開発振り返り)
fujiba
0
130
Symfonyの特性(設計思想)を手軽に活かす特性(trait)
ickx
0
110
Kubernetesでセルフホストが簡単なNewSQLを求めて / Seeking a NewSQL Database That's Simple to Self-Host on Kubernetes
nnaka2992
0
190
Featured
See All Featured
The Power of CSS Pseudo Elements
geoffreycrofte
82
6.2k
職位にかかわらず全員がリーダーシップを発揮するチーム作り / Building a team where everyone can demonstrate leadership regardless of position
madoxten
62
53k
Highjacked: Video Game Concept Design
rkendrick25
PRO
1
340
Writing Fast Ruby
sferik
630
63k
How to Talk to Developers About Accessibility
jct
2
170
Principles of Awesome APIs and How to Build Them.
keavy
128
17k
Intergalactic Javascript Robots from Outer Space
tanoku
273
27k
Typedesign – Prime Four
hannesfritz
42
3k
GitHub's CSS Performance
jonrohan
1032
470k
How to optimise 3,500 product descriptions for ecommerce in one day using ChatGPT
katarinadahlin
PRO
1
3.5k
Test your architecture with Archunit
thirion
1
2.2k
Skip the Path - Find Your Career Trail
mkilby
1
93
Transcript
Architectural lessons learned from refactoring a Solr based API application.
Torsten Bøgh Köster (Shopping24) Apache Lucene Eurocon, 19.10.2011
Contents Shopping24 and it‘s API Technical scaling solutions Sharding Caching
Solr Cores „Elastic“ infrastructure business requirements as key factor
@tboeghk Software- and systems- architect 2 years experience with Solr
3 years experience with Lucene Team of 7 Java developers currently at Shopping24
shopping24 internet group
1 portal became n portals
30 partner shops became 700
500k to 7m documents
index fact time •16 Gig Data •Single-Core-Layout •Up to 17s
response time •Machine size limited •Stalled at solr version 1.4 •API designed for small tools
scaling goal: 15-50m documents
ask the nerds „Shard!“ That‘ll be fun! „Use spare compute
cores at Amazon?“ breathe load into the cloud „Reduce that index size“ „Get rid of those long running queries!“
data sharding ...
... is highly effective. 125ms 250ms 375ms 500ms 1 4
8 12 16 20 1shard 2shard 3shard 4shard 6shard 8shard concurrent requests
Sharding: size matters the bigger your index gets, the more
complex your queries are, the more concurrent requests, the more sharding you need
but wait ...
Why do we have such a big index?
7m documents vs. 2m active poducts
fashion product lifecycle meets SEO Bastografie / photocase.com
Separation of duties! Remove unsearchable data from your index.
Why do we have complex queries?
A Solr index designed for 1 portal
Grown into a multi-portal index
Let “sharding“ follow your data ...
... and build separate cores for every client.
Duplicate data as long as access is fast. andybahn /
photocase.com
Streamline your index provisioning process.
A thousand splendid cores at your fingertips.
Throwing hardware at problems. Automated.
evil traps: latency, $$
mirror your complete system – solve load balancer problems froodmat
/ photocase.com
I said faster!
use a cache layer like Varnish.
What about those complex queries? Why do we have them?
And how do we get rid of them?
Lost in encapsulation: Solr API exposed to world.
What‘s the key factor?
look at your business requirements
decrease complexity
Questions? Comments? Ideas? Twitter: @tboeghk Github: @tboeghk Email:
[email protected]
Web:
http://www.s24.com Images: sxc.hu (unless noted otherwise)