Level 103 for Presto: Deep Dive into the PrestoDB Architecture at Twitter

Part 3 of the Tech Talk Series for Presto
Chunxu Tang, Software Engineer, Twitter

Presto 103 Agenda
• Presto Architecture
• How Twitter Scales Presto Clusters
• Story of Query Predictor at Twitter

Presto Architecture

How Presto Works

How Twitter Scales Presto Clusters

Presto at Twitter
• >6 Clusters
• >2000 Presto Workers
• >10 Million Queries

How to Scale?
• Scalability
  ◦ Single coordinator with 1000+ workers
  ◦ Hundreds of queries in one waiting queue
  ◦ Split UIs
• Availability
  ◦ Downtime during deployment
  ◦ Single point of failure

Presto Federation
• Scalability
  ◦ Easily scaled up
  ◦ Waiting queue split across sub-clusters
  ◦ Aggregated UI and APIs
• Availability
  ◦ Rolling deploy
  ◦ No single point of failure
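
The federation idea above can be sketched as a thin router in front of several sub-clusters, each with its own waiting queue. This is a minimal illustration only: the class names (`SubCluster`, `FederationRouter`), the `draining` flag used to model a rolling deploy, and the shortest-queue routing policy are all assumptions made for the example. The slides do not say how Twitter's router picks a cluster.

```python
from collections import deque
from dataclasses import dataclass, field


@dataclass
class SubCluster:
    """One Presto cluster behind the federation layer (hypothetical model)."""
    name: str
    queue: deque = field(default_factory=deque)  # this cluster's own waiting queue
    draining: bool = False  # True while the cluster is being redeployed


class FederationRouter:
    """Hypothetical router; the routing policy is an assumption for illustration."""

    def __init__(self, clusters):
        self.clusters = list(clusters)

    def submit(self, query_id):
        # Clusters that are being redeployed do not accept new queries,
        # so a rolling deploy never takes the whole service down.
        candidates = [c for c in self.clusters if not c.draining]
        if not candidates:
            raise RuntimeError("no sub-cluster available")
        # Assumed policy: send the query to the shortest waiting queue.
        target = min(candidates, key=lambda c: len(c.queue))
        target.queue.append(query_id)
        return target.name

    def queued_queries(self):
        # Aggregated view over all sub-clusters, as a single UI or API would expose.
        return {c.name: list(c.queue) for c in self.clusters}


if __name__ == "__main__":
    router = FederationRouter([SubCluster("a"), SubCluster("b")])
    router.clusters[0].draining = True  # simulate a rolling deploy of cluster "a"
    print(router.submit("q1"))
    print(router.queued_queries())
```

Because each sub-cluster keeps its own queue, adding capacity means registering another `SubCluster`, and no single coordinator has to hold every waiting query.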

Basic Scheduling
• Starvation
• Waste of resources
• Long waiting time

Intelligent Scheduling

Story of Query Predictor at Twitter

Query Predictor
Quick estimation of the CPU and memory resources needed for a SQL query.
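
The evaluation slides later in this deck report results per CPU-time range and per peak-memory range, which suggests the estimate is a coarse category and not an exact number. A minimal sketch of that bucketing follows. The range labels come from the evaluation slide; the function names, the treatment of boundary values (upper bounds exclusive), and the decimal definitions of 1MB and 1TB are assumptions made for the example.

```python
# Upper bounds (exclusive) and labels, taken from the ranges on the evaluation slide.
CPU_TIME_BOUNDS = [
    (30, "< 30s"),
    (60 * 60, "30s - 1h"),
    (5 * 60 * 60, "1h - 5h"),
]
# Assumption: 1MB = 10**6 bytes and 1TB = 10**12 bytes.
PEAK_MEMORY_BOUNDS = [
    (10**6, "< 1MB"),
    (10**12, "1MB - 1TB"),
]


def _bucket(value, bounds, overflow_label):
    """Return the label of the first range whose upper bound exceeds value."""
    for upper, label in bounds:
        if value < upper:
            return label
    return overflow_label


def cpu_time_bucket(cpu_seconds):
    """Map CPU time in seconds to one of the four CPU-time categories."""
    return _bucket(cpu_seconds, CPU_TIME_BOUNDS, "> 5h")


def peak_memory_bucket(peak_bytes):
    """Map peak memory in bytes to one of the three memory categories."""
    return _bucket(peak_bytes, PEAK_MEMORY_BOUNDS, "> 1TB")


if __name__ == "__main__":
    print(cpu_time_bucket(12))
    print(peak_memory_bucket(2 * 10**9))
```

Labels like these are what a classifier would be trained to predict from the SQL text, and what a scheduler could consume before the query runs.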

High-Level Design

Machine Learning Pipeline

Problem Definition

Query Predictor Pipeline

Model Evaluation (Accuracy) 94.2% 92.4%

Model Evaluation (Precision & Recall)

CPU time
             Precision  Recall  F1-score
< 30s        0.98       0.98    0.98
30s - 1h     0.89       0.91    0.90
1h - 5h      0.85       0.74    0.79
> 5h         0.89       0.90    0.89

Peak memory bytes
             Precision  Recall  F1-score
< 1MB        0.98       0.92    0.95
1MB - 1TB    0.77       0.88    0.83
> 1TB        0.94       0.96    0.95
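
As a reading aid for the figures above, the F1-score is the harmonic mean of precision and recall, F1 = 2PR / (P + R). The sketch below recomputes it from the rounded precision and recall values on the slide. The helper name `f1_score` and the dictionary layout are choices made for this example. Because the inputs are already rounded to two decimals, a recomputed value can differ from the reported one by about 0.01, as it does for the 1MB - 1TB row.

```python
def f1_score(precision, recall):
    """Harmonic mean of precision and recall; 0.0 when both are zero."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# (precision, recall, reported F1) as shown on the evaluation slide.
REPORTED = {
    "CPU time < 30s": (0.98, 0.98, 0.98),
    "CPU time 30s - 1h": (0.89, 0.91, 0.90),
    "CPU time 1h - 5h": (0.85, 0.74, 0.79),
    "CPU time > 5h": (0.89, 0.90, 0.89),
    "Peak memory < 1MB": (0.98, 0.92, 0.95),
    "Peak memory 1MB - 1TB": (0.77, 0.88, 0.83),
    "Peak memory > 1TB": (0.94, 0.96, 0.95),
}

if __name__ == "__main__":
    for label, (p, r, reported) in REPORTED.items():
        print(f"{label}: reported {reported:.2f}, recomputed {f1_score(p, r):.3f}")
```

The 1h - 5h CPU-time row stands out: recall of 0.74 means roughly a quarter of the queries that truly fall in that range are assigned to another category.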

An example query from TPC-H benchmark

Join the Presto Community
• Request a new feature or file a bug: github.com/prestodb/presto
• Slack: prestodb.slack.com
• Twitter: @prestodb

Stay up-to-date with Ahana
• URL: ahana.io
• Twitter: @ahanaio
