Upgrade to Pro
— share decks privately, control downloads, hide ads and more …
Speaker Deck
Features
Speaker Deck
PRO
Sign in
Sign up for free
Search
Search
Szymon Sobczak - Hadoop + Storm
Search
Base Lab
May 07, 2015
Technology
120
0
Share
Szymon Sobczak - Hadoop + Storm
Combo for realtime big data systems
Base Lab
May 07, 2015
More Decks by Base Lab
See All by Base Lab
Slawek Skowron - Monitoring @ Scale
baselab
0
150
Karol Nowak - Monitoring clock drift in Amazon EC2 environment
baselab
0
130
Tomasz Nowak - Web Application Testing made easy
baselab
0
310
Szymon Pawlik - UX i Automatyzacja czyli jak testerzy mogą poprawić produkt.
baselab
0
260
Mateusz Herych - LIKE '%smth%' is not the way
baselab
0
160
Jerzy Chałupski - Offline mode in Android apps
baselab
3
490
Jerzy Chałupski - Data model on Android
baselab
4
250
Other Decks in Technology
See All in Technology
AI Adaptable なテストを整える工夫 / Ways to Make Your Tests AI-Adaptable
bitkey
PRO
2
160
oracle-to-databricks-migration-with-llm-and-dbt
casek
1
360
Platform Engineering as a Product: Criteria for Improvement and Multi-Tenant Design
kumorn5s
0
220
Datadog 認定試験の概要と対策
uechishingo
0
190
JJUG CCC 2026 Spring AI時代の開発こそ標準化を武器に! ― 方式・プロセス・プラットフォームの標準化
s27watanabe
2
610
Gradle×GitHub_ActionsでCI時間を約50%短縮 ジョブ分割の設計と落とし穴 / Cutting CI Time by ~50% with Gradle and GitHub Actions: Job-Splitting Design and Pitfalls
takatty
0
520
Kiro CLI v2.0.0がやってきた!
kentapapa
0
210
コードレビューを制するチームがソフトウェアデリバリーのフローを制す / Beyond Code Review: Distributing Its Responsibilities Across the SDLC
mtx2s
1
370
イベントで大活躍する電子ペーパー名札 〜その3〜 / ビジュアルプログラミングIoTLT vol.23
you
PRO
0
160
AI-DLCを活用した高品質・安全なAI駆動開発実践 / AI Driven Development
yoshidashingo
1
240
組織の中で自分を経営する技術
shoota
0
220
大規模災害時でも高い信頼性を維持するアプリケーション基盤の実現/nikkei-tech-talk46
nikkei_engineer_recruiting
0
120
Featured
See All Featured
The #1 spot is gone: here's how to win anyway
tamaranovitovic
2
1.1k
Noah Learner - AI + Me: how we built a GSC Bulk Export data pipeline
techseoconnect
PRO
0
190
The Art of Programming - Codeland 2020
erikaheidi
57
14k
Evolving SEO for Evolving Search Engines
ryanjones
0
210
Navigating the moral maze — ethical principles for Al-driven product design
skipperchong
2
370
Context Engineering - Making Every Token Count
addyosmani
9
920
What's in a price? How to price your products and services
michaelherold
247
13k
What the history of the web can teach us about the future of AI
inesmontani
PRO
1
590
Building Better People: How to give real-time feedback that sticks.
wjessup
370
20k
Game over? The fight for quality and originality in the time of robots
wayneb77
1
180
Being A Developer After 40
akosma
91
590k
Designing Powerful Visuals for Engaging Learning
tmiket
1
380
Transcript
Szymon Sobczak
Hadoop + Storm Combo for realtime big data systems
Plan • Hadoop & Storm • Our setup • What
projects are we running • Decisions we had to make
Hadoop
None
Storm “Ala ma kota Artura" “ala”, “ma”, “kota”, “artura" “ala”,
“ma”, “kota”, “artura" a: 2 k: 1 m: 1 “ala”
Storm “Ala ma kota Artura" “ma” “ala”, “ma”, “kota”, “artura"
“ala” a: 2 m: 1 k: 1 “ala”, “artura" “kota”
Common traits
None
Base infrastructure services
Understand how the entire Base system works services
Big Data S3 uploader
Four example projects • Debugging • Reporting • Email intelligence
• Forecasting
Debugging S3 uploader
Reporting
Email analysis S3 uploader
Forecasting S3 uploader
Decisions we made ☑ Collect *all* data ☑ Put them
in one place ☐ Build platform for engineers ☐ Same code on Hadoop and Storm
Summary S3 uploader
Questions?
Thank you
[email protected]
bigdata.getbase.com