Upgrade to Pro
— share decks privately, control downloads, hide ads and more …
Speaker Deck
Features
Speaker Deck
PRO
Sign in
Sign up for free
Search
Search
Understanding the Basics of Data Analysis
Search
Piyush Verma
November 13, 2017
Technology
0
310
Understanding the Basics of Data Analysis
Read more on
http://blog.oogway.in
Piyush Verma
November 13, 2017
Tweet
Share
More Decks by Piyush Verma
See All by Piyush Verma
SLOs that Lie
meson10
0
92
Doing SRE the right way - 2
meson10
0
160
Doing SRE the right way
meson10
0
970
Observability and Control Theory
meson10
1
1k
Reliability
meson10
0
130
Reliability of Distributed Systems
meson10
0
230
My TLS was broken
meson10
0
110
Technology that builds Organizations
meson10
0
110
Namespace.go
meson10
0
130
Other Decks in Technology
See All in Technology
Windows 11 で AWS Documentation MCP Server 接続実践/practical-aws-documentation-mcp-server-connection-on-windows-11
emiki
0
960
本が全く読めなかった過去の自分へ
genshun9
0
290
2年でここまで成長!AWSで育てたAI Slack botの軌跡
iwamot
PRO
4
700
Witchcraft for Memory
pocke
1
310
Tech-Verse 2025 Keynote
lycorptech_jp
PRO
0
100
なぜ私はいま、ここにいるのか? #もがく中堅デザイナー #プロダクトデザイナー
bengo4com
0
410
5min GuardDuty Extended Threat Detection EKS
takakuni
0
140
Amazon ECS & AWS Fargate 運用アーキテクチャ2025 / Amazon ECS and AWS Fargate Ops Architecture 2025
iselegant
16
5.5k
Understanding_Thread_Tuning_for_Inference_Servers_of_Deep_Models.pdf
lycorptech_jp
PRO
0
120
MySQL5.6から8.4へ 戦いの記録
kyoshidaxx
1
210
Liquid Glass革新とSwiftUI/UIKit進化
fumiyasac0921
0
210
変化する開発、進化する体系時代に適応するソフトウェアエンジニアの知識と考え方(JaSST'25 Kansai)
mizunori
1
210
Featured
See All Featured
jQuery: Nuts, Bolts and Bling
dougneiner
63
7.8k
Facilitating Awesome Meetings
lara
54
6.4k
Testing 201, or: Great Expectations
jmmastey
42
7.5k
A designer walks into a library…
pauljervisheath
207
24k
Reflections from 52 weeks, 52 projects
jeffersonlam
351
20k
The Power of CSS Pseudo Elements
geoffreycrofte
77
5.8k
Optimizing for Happiness
mojombo
379
70k
Unsuck your backbone
ammeep
671
58k
Helping Users Find Their Own Way: Creating Modern Search Experiences
danielanewman
29
2.7k
The Art of Delivering Value - GDevCon NA Keynote
reverentgeek
15
1.5k
Practical Tips for Bootstrapping Information Extraction Pipelines
honnibal
PRO
20
1.3k
Six Lessons from altMBA
skipperchong
28
3.8k
Transcript
ABC of Distributed Data Processing. Achieving Buzzword Compliance. 1 Piyush
Verma Oogway Consulting
Common thoughts 2
When will my Data become Big Data?
Hive Data Will Save.
How did we reach here? 5
Data :: Business
Data :: Business
Types of Workload
When do I call it Big Enough?
Why bother with Data Engineering? 10
Why do analysis at all?
Descriptive - Historical. - Deterministic. - Inferential. - Managers make
pretty graphs.
Predictive - Future. - Probabilistic. - Based on Descriptive. -
This is what armchair critics do.
Prescriptive
Architecture: Round 1 15
What does data look like?
Storage Choice 1
Storage Choice 2
Challenges: Round 1 19
Scaling
Archival Policy
Oh no
Garbage / Purging
All related entities end up in complex joins
All Relationships complicate over Dimension of time
Anatomy 26
Anatomy
Challenges: Round 2 28
Snowflake Schema
Star Schema
De-Duplication
Bloom Filters Cuckoo Filters - Does not exist for sure.
- May or may not exist.
Slow Changing Dimensions
Batching vs Streaming
Out-of-Order Processing
Cubes • Efficiency of Retrieval • Warehouse:Cube :: DB:Table •
View: Dimension + Measure • Slice, Dice & Rotate
Architecture: Revisited 37
Sample Solution
Thank you! Piyush Verma @meson10 Oogway Consulting http://oogway.in