How to reindex 1B documents in 1 hour?
Qaiser Abbasi
December 13, 2018
Talk given at www.meetup.com/Elasticsearch-Berlin/
Transcript
Rene Treffer, Qaiser Abbasi
How to reindex 1B documents in 1 hour?
Search @ SoundCloud
Powered by ElasticSearch
Typical search document
Clusters of 30 nodes
data size * replication = 120% * total memory
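The sizing rule on this slide can be turned into a quick calculation. The sketch below assumes that "replication" means the total number of copies of each shard (primary plus replicas) and that "total memory" is the sum of RAM across all nodes; the slide does not spell out either. The 64 GB per node and the two copies are hypothetical values chosen only for illustration.

```python
def max_primary_data_gb(nodes: int, memory_per_node_gb: float,
                        copies: int, ratio: float = 1.2) -> float:
    """Primary data size that satisfies data size * copies = ratio * total memory."""
    total_memory_gb = nodes * memory_per_node_gb
    return ratio * total_memory_gb / copies


# 30 nodes comes from the slide; memory per node and copies are hypothetical.
primary_gb = max_primary_data_gb(nodes=30, memory_per_node_gb=64, copies=2)
print(f"{primary_gb:.0f} GB of primary data")
```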
Multiple clusters per use-case: Cluster 1, Cluster 2, Cluster 3
Problems?
Lead time of features and bugfixes
Indexing
1. Extract
2. Build ES documents
3. Load into ES
0. Live updates
Kafka
[Diagram labels: indexer, Kafka (historic, current, compaction), shipper 1, shipper 2, Cluster 1, Cluster 2]
Kafka for ES documents
1. Enable compaction
2. Use fast compression
3. Use enough partitions
4. Use SSDs + 10GBit
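A minimal sketch of the first three points, under some assumptions. `cleanup.policy` and `compression.type` are real Kafka topic-level settings, but the slide names no values: `lz4` as the "fast" codec and the partition count of 64 are guesses made for illustration. The `compact` helper is a hypothetical simulation of the state that log compaction eventually converges to, which is why a compacted topic keyed by document ID can hold the latest version of every ES document.

```python
from typing import Any, Dict, Iterable, Optional, Tuple

# Topic-level settings. The keys are real Kafka configs; the values are assumptions.
TOPIC_CONFIG: Dict[str, str] = {
    "cleanup.policy": "compact",  # keep only the latest record per key
    "compression.type": "lz4",    # assumed choice of a fast codec
}
NUM_PARTITIONS = 64  # hypothetical; the slide only says "enough"


def compact(log: Iterable[Tuple[str, Optional[Any]]]) -> Dict[str, Any]:
    """Simulate the end state of log compaction: the last value per key.

    A record with value None acts as a tombstone and removes the key.
    """
    latest: Dict[str, Any] = {}
    for key, value in log:
        if value is None:
            latest.pop(key, None)
        else:
            latest[key] = value
    return latest


# Two versions of doc-1, then doc-2 is written and deleted again.
log = [("doc-1", {"title": "v1"}), ("doc-2", {"title": "x"}),
       ("doc-1", {"title": "v2"}), ("doc-2", None)]
snapshot = compact(log)
print(snapshot)
```

Replaying such a topic from the beginning yields one record per live document, so a fresh cluster can be loaded without going back to the source systems.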
ES cluster lifecycle: Reindex, Live, Maintenance
Reindex settings
1. Shards
2. Replication settings
3. Async Translog
4. Refresh Interval
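The four settings above map onto standard Elasticsearch index settings. The slide lists only their names, so the values in this sketch are assumptions based on common bulk-loading practice (no replicas and no refresh while loading), not the speakers' actual configuration. The function only builds the request body for index creation and sends nothing.

```python
import json


def reindex_settings(shards: int) -> dict:
    """Build an index-creation body tuned for bulk loading (values are assumed)."""
    return {
        "settings": {
            "index": {
                "number_of_shards": shards,           # 1. Shards
                "number_of_replicas": 0,              # 2. Replication: none while loading
                "translog": {"durability": "async"},  # 3. Async translog
                "refresh_interval": "-1",             # 4. Refresh disabled while loading
            }
        }
    }


body = reindex_settings(shards=30)  # shard count is hypothetical
print(json.dumps(body, indent=2))
```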
Finish reindexing
1. Merge into one segment***
2. Set # replicas
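Both finishing steps correspond to existing Elasticsearch REST endpoints: the force merge API with `max_num_segments=1` and the update index settings API. The sketch below only constructs the two requests as (method, path, body) tuples so it runs without a cluster. The index name `tracks_v2` and the replica count are hypothetical.

```python
from typing import List, Optional, Tuple

Request = Tuple[str, str, Optional[dict]]


def finish_requests(index: str, replicas: int) -> List[Request]:
    """Return the requests that finish a reindex, in the order they should run."""
    return [
        # 1. Merge each shard down to a single segment.
        ("POST", f"/{index}/_forcemerge?max_num_segments=1", None),
        # 2. Raise the replica count so the merged segments get copied.
        ("PUT", f"/{index}/_settings", {"index": {"number_of_replicas": replicas}}),
    ]


requests_to_send = finish_requests("tracks_v2", replicas=1)
for method, path, payload in requests_to_send:
    print(method, path, payload)
```

Merging before adding replicas means the replicas copy already merged segments instead of each replica repeating the merge work.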
Throughput ≈ 600K op/s, ≈ 30 mins
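A quick check that these two numbers agree with the talk title, assuming one indexing operation per document and a sustained rate over the whole run:

```python
docs = 1_000_000_000      # 1B documents, from the talk title
ops_per_second = 600_000  # throughput from the slide

seconds = docs / ops_per_second
minutes = seconds / 60
print(f"{minutes:.1f} minutes")  # prints "27.8 minutes"
```

That is consistent with the rounded figure of about 30 minutes and leaves headroom within the 1 hour target.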
4X faster for 95% ≈ 40ms for 50%
4X Reindexing in 1 Sprint
Summary
• Solved initial problem
• Enablement in daily life
Future work
Q & A
Sounds interesting? Come and talk to us!
THANK YOU