Upgrade to Pro
— share decks privately, control downloads, hide ads and more …
Speaker Deck
Features
Speaker Deck
PRO
Sign in
Sign up for free
Search
Search
How to reindex 1B documents in 1 hour?
Search
Qaiser Abbasi
December 13, 2018
Business
0
750
How to reindex 1B documents in 1 hour?
Talk given at
www.meetup.com/Elasticsearch-Berlin/
Qaiser Abbasi
December 13, 2018
Tweet
Share
More Decks by Qaiser Abbasi
See All by Qaiser Abbasi
Java User Group Frankfurt – CDI BeanTesting
qabbasi
0
63
Testing-Darwinismus
qabbasi
0
58
Other Decks in Business
See All in Business
略歴 (2025年6月27日)
tsogo817421
2
350
Brief Profile (June 27, 2025)
tsogo817421
2
330
Introduction of Elastic Infra Inc.
elasticinfra
0
720
特別講義 理系のための法学入門
seko_shuhei
2
2.4k
事業計画及び成長可能性に関する事項 2025年6月25日
cynd
0
730
フルリモートで社内にどうやって自分の居場所を作るのか?
satoshi256kbyte
10
17k
Management Workflow
dskst
2
360
Sapeet Recruiting
sapeet
0
3.2k
第9回 情シス転職ミートアップ - わたしのミッションとLayerXに決めた理由
shimosyan
0
450
KINTOテクノロジーズ OsakaTechLab説明資料 / ktc-osakatechlab-introduction.pdf
ktc_creative
0
510
大AI時代を長く活躍するための 「コンフォート・ゾーン」の新解釈
mkitahara01985
0
1k
Expedi𝓪®️ USA Contact Numbers: Complete 2-0-2-5 Support Guide
travelhupsupport
0
100
Featured
See All Featured
Designing Dashboards & Data Visualisations in Web Apps
destraynor
231
53k
Build your cross-platform service in a week with App Engine
jlugia
231
18k
Scaling GitHub
holman
460
140k
Principles of Awesome APIs and How to Build Them.
keavy
126
17k
ReactJS: Keep Simple. Everything can be a component!
pedronauck
667
120k
RailsConf & Balkan Ruby 2019: The Past, Present, and Future of Rails at GitHub
eileencodes
138
34k
How to train your dragon (web standard)
notwaldorf
96
6.1k
Easily Structure & Communicate Ideas using Wireframe
afnizarnur
194
16k
Building Better People: How to give real-time feedback that sticks.
wjessup
367
19k
The Success of Rails: Ensuring Growth for the Next 100 Years
eileencodes
45
7.5k
Rebuilding a faster, lazier Slack
samanthasiow
83
9.1k
"I'm Feeling Lucky" - Building Great Search Experiences for Today's Users (#IAC19)
danielanewman
229
22k
Transcript
Rene Treffer, Qaiser Abbasi How to reindex 1B documents in
1 hour?
Search @ SoundCloud
Powered by ElasticSearch
Typical search document
Clusters of 30 nodes
Clusters of 30 nodes data size * replication = 120%
* total memory
Cluster 2 Cluster 3 Cluster 1 Cluster 2 Cluster 3
Cluster 1 Multiple clusters per use-case
Problems?
Lead time of features and bugfixes Problems?
Indexing
Indexing 1. Extract
Indexing 1. Extract 2. Build ES documents
Indexing 1. Extract 2. Build ES documents 3. Load into
ES
Indexing 1. Extract 2. Build ES documents 3. Load into
ES 0. Live updates
Indexing 1. Extract 2. Build ES documents 3. Load into
ES 0. Live updates Kafka Kafka +
Kafka historic current compaction Cluster 1 Cluster 2 shipper 1
shipper 2 indexer
Kafka for ES documents 1. Enable compaction 2. Use fast
compression 3. Use enough partitions 4. Use SSDs + 10GBit
ES cluster lifecycle Reindex Live Maintenance
Reindex settings 1. Shards 2. Replication settings 3. Async Translog
4. Refresh Interval
Finish reindexing 1. Merge into one segment*** 2. Set #
replicas
Throughput ≈ 600K OP/s ≈ 30 Mins
4X faster for 95% ≈ 40ms for 50%
4X Reindexing in 1 Sprint
Summary • Solved initial problem • Enablement in daily life
Future work
Q & A
Sounds interesting? Come and talk to us!
THANK YOU
[email protected]
[email protected]