Slide 1

Slide 1 text

Data Warehouse or Data Lake: Which One Do I Use?

Slide 2

Slide 2 text

Today’s Speakers

John Santaferraro, Industry Analyst, Ferraro Consulting
John has 26 years of experience in data and analytics, including research, implementation, and product marketing.

Dipti Borkar, Cofounder & Chief Product Officer, Ahana
Dipti is an expert in relational and non-relational database engines. She is also Presto Foundation Outreach Chair.

© 2021 Enterprise Management Associates, Inc.

Slide 3

Slide 3 text

Event Recording
• An archived version of the event recording will be available at www.ahana.com

Questions
• Log questions in the chat panel located in the lower left-hand corner of your screen
• Questions will be addressed during the Q&A session of the event

@ema_research

Slide 4

Slide 4 text

Agenda
1. Traditional Data Lakes and Data Warehouses
2. Modern Data Lakes and Data Warehouses
3. Merging the Data Lake and the Data Warehouse
4. Use Cases for Modern Platforms
5. The Uber Technical Case Study
6. Questions & Answers

Slide 5

Slide 5 text

The Traditional Data Warehouse
• Relational Database
• Columnar Structure
• In-Database Analytics
• Structured Data
• Modeled Data
• Extract, Transform, Load
• SQL Access
• Expensive
• Difficult to Manage
• Costly to Maintain
• Limited Data
• Limited Access

Slide 6

Slide 6 text

The Traditional Data Lake
• File System Data Store
• Semi-Structured Data
• Ingestion
• Discovery
• Data Science
• Notebook and Python Access
• Less expensive, but…
• Limited Performance
• Limited Analytics
• Limited SQL Access
• Difficult to Govern

Slide 7

Slide 7 text

The Drivers Behind Modernization
• Digital Transformation
• Real Time Events
• Automation of Everything
• More Data
• Fast Data
• Smart Data

Slide 8

Slide 8 text

The Modern Data Warehouse v. The Modern Data Lake

The Modern Data Warehouse
• Cloud-First
• In-Memory Capabilities
• Complex Data Types
• Separate Storage & Compute
• Expanded Analytics
• Improved Performance
• Storage Options
• SQL Access

The Modern Data Lake
• Cloud-First
• In-Memory Capabilities
• Columnar Data Types
• Separate Storage & Compute
• Expanded Analytics
• Improved Performance
• Storage Options
• SQL Access

Slide 9

Slide 9 text

From Data to Insight - The SQL Query Engine
[Diagram labels: Data; SQL Query Processing; Data Warehouse; Cloud Data Lake; Open Source Data Warehouse; 1-10 TB; 1 TB -> PB; Reporting & Dashboarding; Data Science; In-data lake transformation]

Slide 10

Slide 10 text

Use Cases Criteria for the Modern Data Lake and Data Warehouse

Modern Data Lake
1. High-Performance, Data Intensive
2. Lower Cost Storage
3. Massive Scale
4. Diverse Data Types
5. Diverse Analytical Types
6. Diverse Access Types
7. Enterprise Capabilities
8. High Concurrency of Analytics

Modern Data Warehouse
1. High-Performance, Compute Intensive
2. Lower Cost Storage
3. Enterprise Capabilities
4. High Concurrency of Analytics
5. Diverse Analytical Types
6. Massive Scale
7. Diverse Data Types
8. Diverse Access Types

Slide 11

Slide 11 text

Merging the Cloud Data Warehouse and the Cloud Data Lake
1. From two platforms to one
2. From two resource types to one
3. From self-managed to fully managed
4. From complex query joins to simple
5. From disparate to connected intelligence

Slide 12

Slide 12 text

Merging the Data Warehouse and the Data Lake with a Distributed Query Engine
1. SQL Access
2. Data Lake and Data Warehouse Access
3. Unified Analytics
4. Distributed Queries
5. Limitless Scale
6. Complex Data Types

• Leverage Resources
• Better Insight
• More Use Cases
• Leverage Platforms
• Remove Limits
• Amplified Insight
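The idea of one SQL statement reaching both a data warehouse and a data lake can be illustrated with a toy sketch. This is not Presto and not a distributed engine: it uses Python's built-in sqlite3 module, with two attached in-memory databases standing in for the two platforms. The schema names, table names, and sample rows are all hypothetical.

```python
import sqlite3

# Hypothetical stand-ins: the default "main" database plays the data warehouse,
# and a second attached in-memory database named "lake" plays the data lake.
conn = sqlite3.connect(":memory:")
conn.execute("ATTACH DATABASE ':memory:' AS lake")

# Modeled, structured data on the "warehouse" side.
conn.execute("CREATE TABLE main.customers (customer_id INTEGER PRIMARY KEY, region TEXT)")
conn.executemany(
    "INSERT INTO main.customers VALUES (?, ?)",
    [(1, "EMEA"), (2, "APAC")],
)

# Raw event data on the "lake" side.
conn.execute("CREATE TABLE lake.click_events (customer_id INTEGER, clicks INTEGER)")
conn.executemany(
    "INSERT INTO lake.click_events VALUES (?, ?)",
    [(1, 5), (1, 7), (2, 3)],
)


def clicks_by_region(connection):
    """Join both stores in a single SQL statement, without copying data first."""
    query = """
        SELECT c.region, SUM(e.clicks) AS total_clicks
        FROM main.customers AS c
        JOIN lake.click_events AS e ON e.customer_id = c.customer_id
        GROUP BY c.region
        ORDER BY c.region
    """
    return connection.execute(query).fetchall()


result = clicks_by_region(conn)
print(result)  # [('APAC', 3), ('EMEA', 12)]
```

In a real distributed query engine the two sources would be separate systems reached through connectors, and the join would be planned across many workers. The sketch only shows the user-facing shape: one query, two stores.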

Slide 13

Slide 13 text

Considerations for Any Unified Analytics Decision
• Data
• Analytics
• Users
• Platform
• Cloud
• Enterprise
• Business
• Cost

Slide 14

Slide 14 text

Considerations for Any Unified Analytics Decision
Data
• Structured
• Semi-Structured
• Real Time
• Structured
• Complex Data Types
• Textual
• Streaming

Slide 15

Slide 15 text

Considerations for Any Unified Analytics Decision
Data, Analytics, Users, Platform
• SQL
• Python
• Notebook
• Search

Slide 16

Slide 16 text

Considerations for Any Unified Analytics Decision
Data, Analytics, Users, Platform
• Engineer
• Analyst
• Scientist
• Business

Slide 17

Slide 17 text

Considerations for Any Unified Analytics Decision
• Data
• Analytics
• Users
• Platform
• Cloud
• Enterprise
• Business
• Cost

Slide 18

Slide 18 text

Considerations for Any Unified Analytics Decision
Cloud, Enterprise, Business, Cost
• Elasticity
• Scale
• Mobility
• Globality

Slide 19

Slide 19 text

Considerations for Any Unified Analytics Decision
Cloud, Enterprise, Business, Cost
• Security
• Privacy
• Governance
• Unification

Slide 20

Slide 20 text

Considerations for Any Unified Analytics Decision
Cloud, Enterprise, Business, Cost
• Semantics
• Logic
• Value
• Optimization

Slide 21

Slide 21 text

Considerations for Any Unified Analytics Decision
Cloud, Enterprise, Business, Cost
• Forecast
• Containment
• Chargeback
• Scale

Slide 22

Slide 22 text

Considerations for Any Unified Analytics Decision
• Data
• Analytics
• Users
• Platform
• Cloud
• Enterprise
• Business
• Cost

Slide 23

Slide 23 text

Uber: A User and Developer of Presto

Presto and Hyperscale Analytics
• 10,000 cities
• 18+ million trips per day
• 256 petabytes of stored data
• 35 petabytes of new data every day
• 12,000 monthly active users of analytics running more than 400,000 queries every single day

Presto and Enterprise Readiness
• Automation, Workload Management, Complex Queries, Security

Presto and Technical Value
• Extended Analytics: Analytical Functions, Complex Data Types
• Expanded Use Cases: ETL, Data Science, Exploration, Online Analytical Processing, Federated Queries

Presto and the Future
• Realtime, Exabyte Scale, Sampling, Optimization
• Project Aria, Project Presto Unlimited, Fireball
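To put the hyperscale figures in perspective, a short back-of-the-envelope calculation converts the daily query count into average rates. The inputs are the numbers quoted on the slide. The averaging is an assumption made here for illustration: real load is bursty, the slide says "more than" 400,000 queries, and dividing by monthly active users understates the load per user on any given day.

```python
SECONDS_PER_DAY = 24 * 60 * 60  # 86,400

# Figures quoted on the slide (treated as lower bounds).
queries_per_day = 400_000
monthly_active_users = 12_000

# Average sustained query rate if load were spread evenly over the day.
avg_queries_per_second = queries_per_day / SECONDS_PER_DAY

# Rough queries per user per day, assuming every monthly active user is active daily.
queries_per_user_per_day = queries_per_day / monthly_active_users

print(f"Average queries per second: {avg_queries_per_second:.2f}")      # 4.63
print(f"Queries per user per day:   {queries_per_user_per_day:.1f}")    # 33.3
```

Even as a flat average, this works out to several queries arriving every second around the clock, which is the kind of concurrency the slide's enterprise-readiness items (workload management, automation) are meant to address.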

Slide 24

Slide 24 text

Let’s Talk About Ahana!

Slide 25

Slide 25 text


Slide 26

Slide 26 text

How Ahana Cloud Works
• ~30 minutes to create the compute plane
• Sign up at https://app.ahana.cloud/signup
• Create Presto clusters in your account

Slide 27

Slide 27 text

Ahana Cloud – Reference Architecture
• Distributed SQL engine with proven scalability
• Interactive ANSI SQL queries
• Query data where it lives with Federated Connectors (no ETL)
• High concurrency
• Separation of compute and storage

Slide 28

Slide 28 text

Questions?
Please log your questions in the Q&A window