Upgrade to PRO for Only $50/Year—Limited-Time Offer! 🔥
Speaker Deck
Features
Speaker Deck
PRO
Sign in
Sign up for free
Search
Search
How to reindex 1B documents in 1 hour?
Search
Qaiser Abbasi
December 13, 2018
Business
0
770
How to reindex 1B documents in 1 hour?
Talk given at
www.meetup.com/Elasticsearch-Berlin/
Qaiser Abbasi
December 13, 2018
Tweet
Share
More Decks by Qaiser Abbasi
See All by Qaiser Abbasi
Java User Group Frankfurt – CDI BeanTesting
qabbasi
0
65
Testing-Darwinismus
qabbasi
0
62
Other Decks in Business
See All in Business
明和不動産会社概要
prkoho
0
3.9k
業務設計のいろは
shunsuke_takeuchi
PRO
2
510
プロダクト本部カジュアル面談資料
10xinc
0
140
ログラス会社紹介資料 / Loglass Company Deck
loglass2019
12
470k
[NGA] カンパニーデック202511Ver.
ngaltd
PRO
1
450
Company Profile
katsuegu23
2
11k
「なんか好き」を設計する 情緒的価値をPMの武器にする3つのポイント
inagakikay
6
6.2k
現場とIT部門の橋渡しをして3000人の開発者を救った話 / Talk. Collaborate. Support. Lessons from Bridging Field and IT
nttcom
2
1.3k
Mercari-Fact-book_jp
mercari_inc
7
180k
malna-recruiting-pitch
malna
0
12k
フロントエンドにおける「型」の責任分解に対する1つのアプローチ
kinocoboy2
4
1.5k
株式会社TVer 会社紹介資料
techtver
PRO
9
96k
Featured
See All Featured
"I'm Feeling Lucky" - Building Great Search Experiences for Today's Users (#IAC19)
danielanewman
231
22k
Stop Working from a Prison Cell
hatefulcrawdad
273
21k
The Cost Of JavaScript in 2023
addyosmani
55
9.3k
[Rails World 2023 - Day 1 Closing Keynote] - The Magic of Rails
eileencodes
37
2.6k
Producing Creativity
orderedlist
PRO
348
40k
Evolution of real-time – Irina Nazarova, EuRuKo, 2024
irinanazarova
9
1.1k
CoffeeScript is Beautiful & I Never Want to Write Plain JavaScript Again
sstephenson
162
15k
Fight the Zombie Pattern Library - RWD Summit 2016
marcelosomers
234
17k
Save Time (by Creating Custom Rails Generators)
garrettdimon
PRO
32
1.8k
Templates, Plugins, & Blocks: Oh My! Creating the theme that thinks of everything
marktimemedia
31
2.6k
Bootstrapping a Software Product
garrettdimon
PRO
307
120k
What’s in a name? Adding method to the madness
productmarketing
PRO
24
3.8k
Transcript
Rene Treffer, Qaiser Abbasi How to reindex 1B documents in
1 hour?
Search @ SoundCloud
Powered by ElasticSearch
Typical search document
Clusters of 30 nodes
Clusters of 30 nodes data size * replication = 120%
* total memory
Cluster 2 Cluster 3 Cluster 1 Cluster 2 Cluster 3
Cluster 1 Multiple clusters per use-case
Problems?
Lead time of features and bugfixes Problems?
Indexing
Indexing 1. Extract
Indexing 1. Extract 2. Build ES documents
Indexing 1. Extract 2. Build ES documents 3. Load into
ES
Indexing 1. Extract 2. Build ES documents 3. Load into
ES 0. Live updates
Indexing 1. Extract 2. Build ES documents 3. Load into
ES 0. Live updates Kafka Kafka +
Kafka historic current compaction Cluster 1 Cluster 2 shipper 1
shipper 2 indexer
Kafka for ES documents 1. Enable compaction 2. Use fast
compression 3. Use enough partitions 4. Use SSDs + 10GBit
ES cluster lifecycle Reindex Live Maintenance
Reindex settings 1. Shards 2. Replication settings 3. Async Translog
4. Refresh Interval
Finish reindexing 1. Merge into one segment*** 2. Set #
replicas
Throughput ≈ 600K OP/s ≈ 30 Mins
4X faster for 95% ≈ 40ms for 50%
4X Reindexing in 1 Sprint
Summary • Solved initial problem • Enablement in daily life
Future work
Q & A
Sounds interesting? Come and talk to us!
THANK YOU
[email protected]
[email protected]