ABC of Distributed Data Processing

New to the Big Data world? Too many frameworks? Let's take a step back to learn what is done, why it is done, and how we got here.

Piyush Verma

July 22, 2017

Transcript

  1. ABC of Distributed Data Processing.
    Achieving Buzzword Compliance.
    Oogway Consulting

  2. Hive Data Will Save.

  3. Why are we Here?

  4. Types of Workload

  5. Why do analysis at all?

  6. Descriptive
    - Historical.
    - Deterministic.
    - Inferential.
    - Managers make pretty graphs.

  7. Predictive
    - Future.
    - Probabilistic.
    - Based on Descriptive.
    - This is how pundits predict stocks.

  8. Prescriptive

  9. Enough. What does this data look like?

  10. Storage Choice 1

  11. Storage Choice 2

  12. Star Schema

  13. Snowflake Schema

  14. Oh no

  15. What are we solving?

  16. Challenges

  17. Scaling

  18. Archival Policy

  19. Garbage / Purging

  20. All related entities end up in complex joins

  21. All relationships grow more complicated over the dimension of time

  22. Anatomy

  23. One such solution.

  24. Thank you.
    - Piyush Verma
    - [email protected]
    - Twitter: meson10