
An Introduction to Ahana Cloud for Presto on AWS

Ahana
April 14, 2022


In this webinar, we share how to build an open data lake stack with Presto and AWS S3 using Ahana Cloud.


Transcript

  1. Agenda
    • What is a Data Lakehouse?
    • What is Presto?
    • What is Ahana?
    • Ahana Demo
  2. The Next Warehouse: Open Data Lakehouse
    [Diagram labels: Data Warehouse (TBs); Cloud Data Lake (TBs -> PBs); SQL Query Processing; Reporting & Dashboarding; Data Science; In-Data Lake Transformation; Open Data Lake Security, Reliability, and Governance]
  3. What is Presto?
    • Open source, distributed MPP SQL query engine
    • Originally developed at Meta/Facebook as a replacement for Hive
    • Query in place: no need to move data (ETL) from the source
    • Federated querying: join data from different source formats
    • ANSI SQL compliant
    • Designed from the ground up for fast analytic queries against data of any size
    • Proven on petabytes of data
    • SQL-on-Anything
    • Federated querying and pluggable architecture to support many connectors
    • Open source, hosted on GitHub: https://github.com/prestodb
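As a rough illustration of the federated querying point above: Presto exposes each configured connector as a catalog and addresses tables as catalog.schema.table, so a single SQL statement can join tables from different sources. The sketch below only builds such a query string. The catalog names (hive, mysql), schemas, tables, and columns are hypothetical and do not come from the slides.

```python
# Minimal sketch: compose a federated Presto query as a string.
# All catalog, schema, table, and column names are hypothetical examples.

def qualified(catalog: str, schema: str, table: str) -> str:
    """Return a fully qualified Presto table name: catalog.schema.table."""
    return f"{catalog}.{schema}.{table}"


def build_federated_query(orders_table: str, customers_table: str) -> str:
    """Join an orders table in one catalog with a customers table in another."""
    return (
        "SELECT c.region, count(*) AS order_count\n"
        f"FROM {orders_table} AS o\n"
        f"JOIN {customers_table} AS c ON o.customer_id = c.customer_id\n"
        "GROUP BY c.region"
    )


# Assumed setup: "hive" is a catalog over files in S3, "mysql" a catalog over an RDS database.
orders = qualified("hive", "sales", "orders")
customers = qualified("mysql", "crm", "customers")
query = build_federated_query(orders, customers)
print(query)
```

The resulting string could be submitted through any Presto client. The point is only that both sources appear in one statement, with no ETL step copying data between them.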
  4. Presto Use Cases
    • Data Lakehouse analytics
    • Reporting & dashboarding
    • Interactive ad hoc querying
    • Transformation using SQL (ETL)
    • Federated querying across data sources
  5. At A Glance
    • Ahana - The Company
    • Ahana Cloud is SaaS to Query Data Lakes
    • Simplifies SQL analytics on cloud data lakes like S3
    Team Ahana: Cloud, Database & Presto Experts
    • Steven Mih, Cofounder, CEO
    • Dipti Borkar, Cofounder, Chief Products Officer
    • Dave Simmen, Cofounder, Chief Technical Officer
    Recognition
    • 2021 DBTA Best Data 100
    • 2021 Stevie Best Startup
    • 2021 Coolest Analytics
    • 2021 Top 10 Hot Big Data
    • 2020 Datanami Best Big Data Startup Awards
    • 2020 Open Source 100
  6. Ahana Cloud For Presto
    1. Zero to Presto in 30 minutes. Managed cloud service: no installation or configuration.
    2. Built for data teams of all experience levels.
    3. Moderate level of control over deployment, without the complexity.
    4. Dedicated support from Presto experts.
  7. Challenges with SQL on the Data Lakehouse
    Cloud DW / AWS serverless options get very expensive for growing data volumes
    ▪ Cloud data warehouse costs grow much faster than compute engine costs
    ▪ Serverless options like AWS Athena charge per query and get expensive
    “Do it yourself” approach is complicated
    ▪ Big data skills in platform teams are limited
    ▪ Presto is complicated and operationally very time-consuming
    Presto on AWS, such as AWS Athena, has limited capabilities and doesn’t scale
    ▪ Limited concurrency of 20 per account
    ▪ No visibility into cluster logs or query logs; no flexibility or control over scale
  8. Ahana Cloud for Presto
    Ahana Cloud Account: Ahana console oversees and manages every Presto cluster
    • Ahana Console (Control Plane): Cluster Orchestration, Consolidated Logging, Security & Access, Billing & Support
    Customer Cloud Account: In-VPC orchestration of Presto clusters, where metadata, monitoring, and data sources reside
    • In-VPC Presto Clusters (Compute Plane): Ad Hoc Cluster 1, Test Cluster 2, Prod Cluster N
    • Glue, S3, RDS, Elasticsearch
  9. Ahana Cloud – Reference Architecture
    • Distributed SQL engine with proven scalability
    • Interactive ANSI SQL queries
    • Query data where it lives with Federated Connectors (no ETL)
    • High concurrency
    • Separation of compute and storage
  10. Case study: Blinkit
    • India’s instant delivery service
    • Moved from the Data Warehouse to the Open Data Lakehouse powered by Presto & Ahana to power 200K orders/day
    • “Everything delivered in 10 minutes”
    “Ahana is providing Blinkit with a SaaS managed service for Presto, providing the company with the advanced data management capabilities it needs to meet its instant delivery promise.”
    Satyam Krishna, Engineering Manager at Blinkit
  11. How Carbon uses PrestoDB in the Cloud with Ahana to Power its Real-time Customer Dashboards
    Jordan Hoggart, Data Engineer at Carbon
  12. Ahana Cloud for Presto - Summary
    ▪ Brings SQL on AWS S3 with an open data lake + USER
    ▪ Presto compute brought to your data in your VPC in your account
    ▪ Fully managed Presto cluster life cycle including idle-time management
    ▪ Query AWS DBs - RDS/MySQL, RDS/Postgres, Elasticsearch, Redshift
    ▪ Cloud-native and highly available running on Kubernetes
    ▪ Bring your own
        ▪ BI tool / Data Science Notebook
        ▪ Metadata Catalog
        ▪ Transaction Manager
    Easy to use · 3x Price Performance · Open & Flexible
  13. Give it a spin
    • Ahana Cloud is available on the AWS Marketplace
    • Sign up for a 14-day free trial here: https://ahana.io/sign-up