Level 103 for Presto: Deep Dive into the PrestoDB Architecture at Twitter

Ahana
August 25, 2020

In this tech talk, we’ll take a look at the PrestoDB architecture in more technical depth. Next, we’ll have the Presto engineers at Twitter share their PrestoDB use case and architecture which includes several large Presto clusters with over 2,000 nodes.

You’ll learn:

- More in-depth technical concepts around Presto
- How Twitter uses Presto as a highly scalable query predictor service
- The PrestoDB architecture at Twitter

Transcript

  1. Level 103 for Presto: Deep Dive into the PrestoDB architecture at Twitter
     Part 3 of the Tech Talk Series for Presto
     Chunxu Tang, Software Engineer, Twitter
     Hosted by
  2. Presto 103 Agenda
     • Presto Architecture
     • How Twitter Scales Presto Clusters
     • Story of Query Predictor at Twitter
  3. How to Scale?
     • Scalability
       ◦ Single coordinator with 1000+ workers
       ◦ Hundreds of queries in one waiting queue
       ◦ Split UIs
     • Availability
       ◦ Downtime during deployment
       ◦ Single point of failure
  4. Presto Federation
     • Scalability
       ◦ Easily scaled up
       ◦ Waiting queue split across sub-clusters
       ◦ Aggregated UI and APIs
     • Availability
       ◦ Rolling deploy
       ◦ No single point of failure
  5. Model Evaluation (Precision & Recall)

     CPU time
     Range        Precision   Recall   F1-score
     < 30s        0.98        0.98     0.98
     30s - 1h     0.89        0.91     0.90
     1h - 5h      0.85        0.74     0.79
     > 5h         0.89        0.90     0.89

     Peak memory bytes
     Range        Precision   Recall   F1-score
     < 1MB        0.98        0.92     0.95
     1MB - 1TB    0.77        0.88     0.83
     > 1TB        0.94        0.96     0.95
  6. Q&A

  7. Join the Presto Community
     • Request a new feature or file a bug: github.com/prestodb/presto
     • Slack: prestodb.slack.com
     • Twitter: @prestodb

     Stay up-to-date with Ahana
     • URL: ahana.io
     • Twitter: @ahanaio
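
Slide 5 reports precision, recall and F1-score for each CPU-time and peak-memory range. As a minimal sketch of how those three columns relate, the Python below recomputes F1 as the harmonic mean of precision and recall, using the values from the slide. The function and variable names are illustrative and do not come from the talk, and nothing here reflects how Twitter's query predictor was actually evaluated. Because the slide values are already rounded to two decimals, a recomputed F1 can differ from the published one in the last digit (the 1MB - 1TB row recomputes to 0.82 against 0.83 on the slide).

```python
def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall; 0.0 when both are zero."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# (precision, recall) pairs copied from slide 5; dict names are illustrative.
CPU_TIME = {
    "< 30s": (0.98, 0.98),
    "30s - 1h": (0.89, 0.91),
    "1h - 5h": (0.85, 0.74),
    "> 5h": (0.89, 0.90),
}

PEAK_MEMORY = {
    "< 1MB": (0.98, 0.92),
    "1MB - 1TB": (0.77, 0.88),
    "> 1TB": (0.94, 0.96),
}


def f1_table(rows: dict) -> dict:
    """Return the F1-score for each range, rounded to two decimals."""
    return {name: round(f1_score(p, r), 2) for name, (p, r) in rows.items()}


if __name__ == "__main__":
    for title, rows in (("CPU time", CPU_TIME), ("Peak memory bytes", PEAK_MEMORY)):
        print(title)
        for name, f1 in f1_table(rows).items():
            print(f"  {name:<10} F1 = {f1:.2f}")
```

The harmonic mean penalizes an imbalance between the two inputs, which is why the 1h - 5h CPU-time range, where recall falls to 0.74, has the lowest F1 in its table despite a precision of 0.85.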