$30 off During Our Annual Pro Sale. View Details »
Speaker Deck
Features
Speaker Deck
PRO
Sign in
Sign up for free
Search
Search
Understanding the Basics of Data Analysis
Search
Piyush Verma
November 13, 2017
Technology
0
330
Understanding the Basics of Data Analysis
Read more on
http://blog.oogway.in
Piyush Verma
November 13, 2017
Tweet
Share
More Decks by Piyush Verma
See All by Piyush Verma
SLOs that Lie
meson10
0
110
Doing SRE the right way - 2
meson10
0
170
Doing SRE the right way
meson10
0
1k
Observability and Control Theory
meson10
1
1.1k
Reliability
meson10
0
150
Reliability of Distributed Systems
meson10
0
260
My TLS was broken
meson10
0
130
Technology that builds Organizations
meson10
0
130
Namespace.go
meson10
0
150
Other Decks in Technology
See All in Technology
今年のデータ・ML系アップデートと気になるアプデのご紹介
nayuts
1
340
RAG/Agent開発のアップデートまとめ
taka0709
0
180
MLflowダイエット大作戦
lycorptech_jp
PRO
1
120
第4回 「メタデータ通り」 リアル開催
datayokocho
0
130
年間40件以上の登壇を続けて見えた「本当の発信力」/ 20251213 Masaki Okuda
shift_evolve
PRO
1
130
Kubernetes Multi-tenancy: Principles and Practices for Large Scale Internal Platforms
hhiroshell
0
120
AIプラットフォームにおけるMLflowの利用について
lycorptech_jp
PRO
1
140
re:Inventで気になったサービスを10分でいけるところまでお話しします
yama3133
1
120
Lessons from Migrating to OpenSearch: Shard Design, Log Ingestion, and UI Decisions
sansantech
PRO
1
130
.NET 10の概要
tomokusaba
0
110
プロンプトやエージェントを自動的に作る方法
shibuiwilliam
6
5.1k
コンテキスト情報を活用し個社最適化されたAI Agentを実現する4つのポイント
kworkdev
PRO
0
1.2k
Featured
See All Featured
10 Git Anti Patterns You Should be Aware of
lemiorhan
PRO
659
61k
Building a Modern Day E-commerce SEO Strategy
aleyda
45
8.3k
CSS Pre-Processors: Stylus, Less & Sass
bermonpainter
359
30k
Faster Mobile Websites
deanohume
310
31k
Side Projects
sachag
455
43k
Imperfection Machines: The Place of Print at Facebook
scottboms
269
13k
"I'm Feeling Lucky" - Building Great Search Experiences for Today's Users (#IAC19)
danielanewman
231
22k
Connecting the Dots Between Site Speed, User Experience & Your Business [WebExpo 2025]
tammyeverts
10
730
Six Lessons from altMBA
skipperchong
29
4.1k
How to train your dragon (web standard)
notwaldorf
97
6.4k
Measuring & Analyzing Core Web Vitals
bluesmoon
9
710
Distributed Sagas: A Protocol for Coordinating Microservices
caitiem20
333
22k
Transcript
ABC of Distributed Data Processing. Achieving Buzzword Compliance. 1 Piyush
Verma Oogway Consulting
Common thoughts 2
When will my Data become Big Data?
Hive Data Will Save.
How did we reach here? 5
Data :: Business
Data :: Business
Types of Workload
When do I call it Big Enough?
Why bother with Data Engineering? 10
Why do analysis at all?
Descriptive - Historical. - Deterministic. - Inferential. - Managers make
pretty graphs.
Predictive - Future. - Probabilistic. - Based on Descriptive. -
This is what armchair critics do.
Prescriptive
Architecture: Round 1 15
What does data look like?
Storage Choice 1
Storage Choice 2
Challenges: Round 1 19
Scaling
Archival Policy
Oh no
Garbage / Purging
All related entities end up in complex joins
All Relationships complicate over Dimension of time
Anatomy 26
Anatomy
Challenges: Round 2 28
Snowflake Schema
Star Schema
De-Duplication
Bloom Filters Cuckoo Filters - Does not exist for sure.
- May or may not exist.
Slow Changing Dimensions
Batching vs Streaming
Out-of-Order Processing
Cubes • Efficiency of Retrieval • Warehouse:Cube :: DB:Table •
View: Dimension + Measure • Slice, Dice & Rotate
Architecture: Revisited 37
Sample Solution
Thank you! Piyush Verma @meson10 Oogway Consulting http://oogway.in