Serverless Data Processing

Victor Wibisono

May 31, 2018

Transcript

  1. Serverless Data Processing

  2. Contents • What it is and why • Architecture • Demo
  3. Serverless?

  4. Serverless is about focusing your efforts on what provides value to users.
  5. No servers?

  6. None
  7. None
  8. https://serverless.com/learn/

  9. Data processing?

  10. None
  11. None
  12. None
  13. Architecture

  14. None
  15. • Infinitely scalable • 99.9999999% data durability • Pay-per-use, no pre-provisioning
  16. AWS Glue • Spark-as-a-service • No cluster management • Pay-per-use, no pre-provisioning • Services include: data crawling, cataloging (Hive metastore)
  17. AWS Athena • Pay-per-query • No pre-provisioning • Infinitely scalable

  18. https://blog.panoply.io/an-amazonian-battle-comparing-athena-and-redshift

  19. None
  20. Learn more...

  21. https://unnik.s3.amazonaws.com/public-files/unnik-lab-guides/aws-summit-2018/datalake/unnik-aws-summit-2018-datalake-demo.html

  22. Demo
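
As a rough illustration of the pay-per-query model on the Athena slide, the sketch below submits a query, polls until it finishes, and estimates the charge from the bytes scanned. It is not taken from the talk. The price of 5 USD per TB and the 10 MB per-query minimum are assumptions based on commonly published list pricing, which varies by region. The function names `estimate_query_cost` and `run_query` are hypothetical. The client is passed in, so any object exposing `start_query_execution` and `get_query_execution` (such as a boto3 Athena client) should work.

```python
import time

# Assumed list pricing; check the current price for your region.
PRICE_PER_TB_USD = 5.0
MIN_BYTES_PER_QUERY = 10 * 1024 ** 2  # assumed 10 MB minimum per query
BYTES_PER_TB = 1024 ** 4


def estimate_query_cost(bytes_scanned: int) -> float:
    """Estimate the charge in USD for one query from the bytes it scanned.

    Simplification: rounding to whole megabytes is ignored.
    """
    billable = max(bytes_scanned, MIN_BYTES_PER_QUERY)
    return billable / BYTES_PER_TB * PRICE_PER_TB_USD


def run_query(athena, sql, database, output_location, poll_seconds=1.0):
    """Submit a query and poll until it reaches a terminal state.

    Returns a (state, bytes_scanned) tuple.
    """
    started = athena.start_query_execution(
        QueryString=sql,
        QueryExecutionContext={"Database": database},
        ResultConfiguration={"OutputLocation": output_location},
    )
    query_id = started["QueryExecutionId"]

    while True:
        response = athena.get_query_execution(QueryExecutionId=query_id)
        execution = response["QueryExecution"]
        state = execution["Status"]["State"]
        if state in ("SUCCEEDED", "FAILED", "CANCELLED"):
            stats = execution.get("Statistics", {})
            return state, stats.get("DataScannedInBytes", 0)
        time.sleep(poll_seconds)
```

With boto3 installed and credentials configured, a call might look like `run_query(boto3.client("athena"), "SELECT 1", "my_database", "s3://my-bucket/results/")`, where the database and bucket names are placeholders. Because the charge depends on bytes scanned rather than cluster uptime, reducing the data a query reads (for example with partitioned or columnar files) lowers the cost directly.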