How to reindex 1B documents in 1 hour?
Qaiser Abbasi
December 13, 2018
Talk given at
www.meetup.com/Elasticsearch-Berlin/
Transcript
How to reindex 1B documents in 1 hour?
Rene Treffer, Qaiser Abbasi
Search @ SoundCloud
Powered by ElasticSearch
Typical search document
Clusters of 30 nodes
data size * replication = 120% * total memory
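The sizing rule above can be checked with a few lines of arithmetic. This is only a sketch: the node memory, the data size, and the reading of "replication" as the total number of copies of each shard are all assumptions chosen for illustration, not figures from the talk.

```python
def memory_ratio(data_size_gb: float, replication: int, total_memory_gb: float) -> float:
    """Return (data size * replication) divided by total cluster memory."""
    return data_size_gb * replication / total_memory_gb


# Hypothetical numbers, not from the talk: 30 nodes with 64 GB of RAM each,
# and 1152 GB of primary data stored twice (primary plus one replica).
nodes = 30
total_memory_gb = nodes * 64  # 1920 GB across the cluster
ratio = memory_ratio(1152, 2, total_memory_gb)
print(f"{ratio:.0%}")  # 120%
```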
Multiple clusters per use-case
(Diagram: Cluster 1, Cluster 2, Cluster 3)
Problems?
Lead time of features and bugfixes
Indexing
0. Live updates
1. Extract
2. Build ES documents
3. Load into ES
(Diagram labels: Kafka, Kafka +)
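As an illustration of step 3, documents are commonly loaded through the Elasticsearch bulk API, which takes newline-delimited JSON: one action line followed by one source line per document. The sketch below only builds such a request body; the helper name, the index name, and the document fields are hypothetical, and it uses the typeless format of recent Elasticsearch versions, which may differ from what the speakers ran.

```python
import json


def build_bulk_body(index_name, docs):
    """Build an NDJSON body for POST /_bulk from (doc_id, source) pairs."""
    lines = []
    for doc_id, source in docs:
        # Action line: index (create or overwrite) the document with this id.
        lines.append(json.dumps({"index": {"_index": index_name, "_id": doc_id}}))
        # Source line: the document itself.
        lines.append(json.dumps(source))
    # The bulk API requires the body to end with a newline.
    return "\n".join(lines) + "\n" if lines else ""


# Hypothetical documents, for illustration only.
body = build_bulk_body("tracks", [("1", {"title": "a"}), ("2", {"title": "b"})])
print(body, end="")
```

The body would be sent with the Content-Type header set to application/x-ndjson.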
(Diagram: indexer; Kafka with historic and current data, plus compaction; shipper 1 and shipper 2; Cluster 1 and Cluster 2)
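Compaction is what lets one topic hold both historic and current documents: Kafka eventually keeps only the latest record per key, so replaying the topic yields the current state of every document. The sketch below simulates that behaviour in memory. It is a simplification (real Kafka compacts in the background and retains delete markers for a while), and the keys and values are made up.

```python
def compact(log):
    """Keep only the latest value per key; a None value (tombstone) deletes the key."""
    latest = {}
    for key, value in log:
        # Drop any older record first so the survivor sits at its latest position.
        latest.pop(key, None)
        if value is not None:
            latest[key] = value
    return list(latest.items())


# Hypothetical records keyed by document id.
log = [
    ("track:1", {"title": "a"}),
    ("track:2", {"title": "b"}),
    ("track:1", {"title": "a2"}),  # newer version replaces the first record
    ("track:2", None),             # tombstone: the document was deleted
]
compacted = compact(log)
print(compacted)  # [('track:1', {'title': 'a2'})]
```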
Kafka for ES documents
1. Enable compaction
2. Use fast compression
3. Use enough partitions
4. Use SSDs + 10GBit
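The first three points map onto Kafka topic settings. A minimal sketch of such a topic definition follows; `cleanup.policy` and `compression.type` are real topic-level configuration keys, but the choice of lz4 as the "fast" codec, the partition count, and the helper name are assumptions, since the slides do not give values.

```python
def es_document_topic_config(partitions: int) -> dict:
    """Describe a compacted, compressed topic for ES documents (illustrative values)."""
    return {
        "num_partitions": partitions,  # assumption: size this to the desired shipper parallelism
        "config": {
            "cleanup.policy": "compact",  # 1. keep only the latest record per key
            "compression.type": "lz4",    # 2. assumption: lz4 as the fast codec
        },
    }


topic = es_document_topic_config(partitions=64)  # 64 is a made-up number
print(topic)
```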
ES cluster lifecycle
Reindex, Live, Maintenance
Reindex settings
1. Shards
2. Replication settings
3. Async Translog
4. Refresh Interval
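These four points correspond to standard Elasticsearch index settings. The sketch below builds a create-index body along those lines. The setting names are real; the concrete values (zero replicas and a disabled refresh during the load, the shard count) are common bulk-loading choices assumed here, not values confirmed by the slides.

```python
import json


def reindex_index_settings(shards: int) -> dict:
    """Build a create-index body tuned for bulk loading (illustrative values)."""
    return {
        "settings": {
            "index": {
                "number_of_shards": shards,           # 1. fixed at index creation
                "number_of_replicas": 0,              # 2. assumption: no replicas while loading
                "translog": {"durability": "async"},  # 3. fsync the translog in the background
                "refresh_interval": "-1",             # 4. assumption: disable refresh while loading
            }
        }
    }


settings_body = reindex_index_settings(shards=30)  # 30 is a made-up number
print(json.dumps(settings_body, indent=2))
```

Such a body would be sent with PUT to the new index name before loading starts.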
Finish reindexing
1. Merge into one segment***
2. Set # replicas
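Both finishing steps have direct REST equivalents: the force merge API with `max_num_segments=1`, and an index settings update for the replica count. The sketch below only lists the requests rather than sending them; the index name, the replica count, and the helper name are hypothetical.

```python
import json


def finish_reindex_requests(index_name: str, replicas: int):
    """Return (method, path, body) tuples for the two finishing steps."""
    return [
        # 1. Merge each shard down to a single segment.
        ("POST", f"/{index_name}/_forcemerge?max_num_segments=1", None),
        # 2. Raise the replica count so the copies get built from the merged segments.
        ("PUT", f"/{index_name}/_settings",
         json.dumps({"index": {"number_of_replicas": replicas}})),
    ]


for method, path, body in finish_reindex_requests("tracks-v2", replicas=1):
    print(method, path, body or "")
```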
Throughput ≈ 600K OP/s ≈ 30 Mins
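A quick check of how the two figures relate, assuming one indexing operation per document and the 1B documents from the title:

```python
documents = 1_000_000_000        # from the talk title
throughput_per_second = 600_000  # from the slide, approximate

seconds = documents / throughput_per_second
minutes = seconds / 60
print(f"{seconds:.0f} s ≈ {minutes:.1f} min")  # 1667 s ≈ 27.8 min
```

That lands just under the roughly 30 minutes quoted on the slide.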
4X faster for 95%
≈ 40ms for 50%
4X Reindexing in 1 Sprint
Summary
• Solved initial problem
• Enablement in daily life
Future work
Q & A
Sounds interesting? Come and talk to us!
THANK YOU