Slide 1

Slide 1 text

ABC of Distributed Data Processing. Achieving Buzzword Compliance. Oogway Consulting

Slide 2

Slide 2 text

Hive Data Will Save.

Slide 3

Slide 3 text

Why are we Here?

Slide 4

Slide 4 text

Types of Workload

Slide 5

Slide 5 text

Why do analysis at all?

Slide 6

Slide 6 text

Descriptive
- Historical.
- Deterministic.
- Inferential.
- Managers make pretty graphs.

Slide 7

Slide 7 text

Predictive
- Future.
- Probabilistic.
- Based on Descriptive.
- This is how pundits predict stocks.
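
To make the contrast between the two kinds of analysis concrete, here is a minimal sketch. It is only an illustration: the function names `describe` and `predict_next`, the sample numbers, and the choice of a least-squares trend line are assumptions, not something the slides specify. The descriptive part summarises history and always gives the same answer for the same data; the predictive part builds on that history and returns a forecast together with a rough measure of its uncertainty.

```python
from math import sqrt
from statistics import mean


def describe(history):
    """Descriptive: a deterministic summary of what already happened."""
    return {"count": len(history), "mean": mean(history), "max": max(history)}


def predict_next(history):
    """Predictive: extrapolate a least-squares trend one step ahead.

    Returns (forecast, spread). The spread is the residual standard
    deviation, used here as a rough hint of how uncertain the forecast is.
    Needs at least three observations.
    """
    n = len(history)
    xs = range(n)
    x_bar, y_bar = mean(xs), mean(history)
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, history))
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    residuals = [y - (intercept + slope * x) for x, y in zip(xs, history)]
    spread = sqrt(sum(r * r for r in residuals) / (n - 2))
    return intercept + slope * n, spread


if __name__ == "__main__":
    monthly_sales = [10, 12, 13, 17]  # hypothetical numbers
    print(describe(monthly_sales))
    print(predict_next(monthly_sales))
```

The forecast is only as good as the assumption that the past trend continues, which is why it comes with a spread rather than as a single certain number.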

Slide 8

Slide 8 text

Prescriptive

Slide 9

Slide 9 text

Enough. What does this data look like?

Slide 10

Slide 10 text

Storage Choice 1

Slide 11

Slide 11 text

Storage Choice 2

Slide 12

Slide 12 text

Star Schema
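
The slide's diagram is not part of this transcript, so here is a minimal sketch of what a star schema usually looks like: one central fact table whose rows point directly at flat dimension tables. The table names (`fact_sales`, `dim_product`, `dim_date`), the columns, and the sample rows are all hypothetical, and SQLite is used only because it ships with Python.

```python
import sqlite3


def build_star_schema(conn):
    """Create one fact table surrounded by flat dimension tables."""
    conn.executescript("""
        CREATE TABLE dim_product (
            product_id INTEGER PRIMARY KEY, name TEXT, category TEXT);
        CREATE TABLE dim_date (
            date_id INTEGER PRIMARY KEY, day TEXT, month TEXT);
        CREATE TABLE fact_sales (
            sale_id    INTEGER PRIMARY KEY,
            product_id INTEGER REFERENCES dim_product(product_id),
            date_id    INTEGER REFERENCES dim_date(date_id),
            amount     REAL);
    """)
    conn.executemany("INSERT INTO dim_product VALUES (?, ?, ?)",
                     [(1, "tea", "drinks"), (2, "noodles", "food")])
    conn.executemany("INSERT INTO dim_date VALUES (?, ?, ?)",
                     [(1, "2020-01-01", "2020-01"), (2, "2020-02-01", "2020-02")])
    conn.executemany("INSERT INTO fact_sales VALUES (?, ?, ?, ?)",
                     [(1, 1, 1, 10.0), (2, 2, 1, 5.0), (3, 1, 2, 7.5)])


def sales_by_category(conn):
    """Every dimension is exactly one join away from the fact table."""
    return conn.execute("""
        SELECT p.category, SUM(f.amount)
        FROM fact_sales AS f
        JOIN dim_product AS p ON p.product_id = f.product_id
        GROUP BY p.category
        ORDER BY p.category
    """).fetchall()


if __name__ == "__main__":
    conn = sqlite3.connect(":memory:")
    build_star_schema(conn)
    print(sales_by_category(conn))
```

Keeping `category` as a plain column on `dim_product` duplicates the category text across products, but it keeps analytical queries down to a single join per dimension.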

Slide 13

Slide 13 text

Snowflake Schema
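
As a rough illustration, a snowflake schema normalises the dimensions further, so a dimension table can itself point at another table. In the sketch below, the product category moves out of the product dimension into its own table. All names (`dim_category`, `dim_product`, `fact_sales`) and rows are hypothetical, and SQLite stands in for whatever warehouse is actually used.

```python
import sqlite3


def build_snowflake_schema(conn):
    """Create a fact table whose product dimension is itself normalised."""
    conn.executescript("""
        CREATE TABLE dim_category (
            category_id INTEGER PRIMARY KEY, name TEXT);
        CREATE TABLE dim_product (
            product_id  INTEGER PRIMARY KEY,
            name        TEXT,
            category_id INTEGER REFERENCES dim_category(category_id));
        CREATE TABLE fact_sales (
            sale_id    INTEGER PRIMARY KEY,
            product_id INTEGER REFERENCES dim_product(product_id),
            amount     REAL);
    """)
    conn.executemany("INSERT INTO dim_category VALUES (?, ?)",
                     [(1, "drinks"), (2, "food")])
    conn.executemany("INSERT INTO dim_product VALUES (?, ?, ?)",
                     [(1, "tea", 1), (2, "noodles", 2)])
    conn.executemany("INSERT INTO fact_sales VALUES (?, ?, ?)",
                     [(1, 1, 10.0), (2, 2, 5.0), (3, 1, 7.5)])


def snowflake_sales_by_category(conn):
    """The category is now two joins away from the fact table."""
    return conn.execute("""
        SELECT c.name, SUM(f.amount)
        FROM fact_sales AS f
        JOIN dim_product  AS p ON p.product_id  = f.product_id
        JOIN dim_category AS c ON c.category_id = p.category_id
        GROUP BY c.name
        ORDER BY c.name
    """).fetchall()


if __name__ == "__main__":
    conn = sqlite3.connect(":memory:")
    build_snowflake_schema(conn)
    print(snowflake_sales_by_category(conn))
```

The trade-off is less duplicated data in the dimensions in exchange for longer join chains in every query.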

Slide 14

Slide 14 text

Oh no

Slide 15

Slide 15 text

What are we solving?

Slide 16

Slide 16 text

Challenges

Slide 17

Slide 17 text

Scaling

Slide 18

Slide 18 text

Archival Policy

Slide 19

Slide 19 text

Garbage / Purging

Slide 20

Slide 20 text

All related entities end up in complex joins

Slide 21

Slide 21 text

All relationships get more complicated over the dimension of time

Slide 22

Slide 22 text

Anatomy

Slide 23

Slide 23 text

One such solution.

Slide 24

Slide 24 text

Thank you.
- Piyush Verma
- [email protected]
- Twitter: meson10