Upgrade to Pro — share decks privately, control downloads, hide ads and more …

Presto Performance Tuning

Sponsored · Your Podcast. Everywhere. Effortlessly. Share. Educate. Inspire. Entertain. You do you. We'll handle the rest.

Presto Performance Tuning

Avatar for Nellie Maysela

Nellie Maysela

March 11, 2020
Tweet

Other Decks in Technology

Transcript

  1. Only specifying the columns you need! [Bad]: SELECT * FROM

    table [Good]: SELECT id, name, role FROM table
  2. Example WHERE time <=> Integer SELECT ... WHERE time >

    1349393020 TD_TIME_RANGE SELECT … WHERE TD_TIME_RANGE(time, '2020-03-01','2020-03-15', 'UTC')
  3. Considering the cardinality within GROUP BY [BAD]: SELECT ... GROUP

    BY gender, id [GOOD]: SELECT … GROUP BY id, gender *Another tip: Use numbers instead of strings for the GROUP BY column
  4. Use LIMIT with ORDER BY [BAD]: SELECT * FROM tbl

    ORDER BY time [GOOD]: SELECT * FROM tbl ORDER BY time LIMIT 10
  5. SELECT id, name FROM access WHERE method LIKE '%GET%' OR

    method LIKE '%POST%' OR method LIKE '%PUT%' OR method LIKE '%DELETE%' SELECT Id, name FROM access WHERE regexp_like(method, 'GET|POST|PUT|DELETE') *Or in other DB: PostgreSQL: SIMILAR TO Redshift: SIMILAR TO
  6. Specifying large tables first in join clause “The default join

    algorithm of Presto is broadcast join”
  7. Improving Performance of Very Large Queries CTAS ♥ DROP DROP

    TABLE IF EXISTS my_result; CREATE TABLE my_result AS SELECT * FROM my_table;
  8. Here is a list of memory-intensive • Distinct: ◦ SELECT

    distinct c1, c2, c3 FROM my_table ◦ Count(Distinct x) -> Approx_distinct() • UNION: Prefer UNION ALL over UNION • ORDER BY: Avoid ORDER BY • GROUP BY: Reduce the number of GROUP BY columns • Joins: From the largest table to smaller tables ◦ Use distributed joins ◦ -- set session join_distribution_type = 'PARTITIONED' SELECT * FROM table_a, table_b WHERE table_a.id = table_b.id
  9. Optimize this Query! SELECT hotel_name FROM hotel_tb WHERE pic_role =

    ‘Hotel Owner' OR pic_role = 'Assistant' OR pic_role = 'Hotel Ops'
  10. Optimize this Query! SELECT pic_name, pic_contact FROM hotel_tb WHERE pic_role

    = ‘Hotel Ops’ UNION SELECT pic_name, pic_contact FROM hotel_tb WHERE pic_role like ‘h%’ OR hotel_address = ‘Letjen S Parman no 23’
  11. • 7.10. Regular Expression Functions — Presto 0.233 Documentation •

    Pattern (Java Platform SE 8 ) • 1.2. Presto Concepts — Presto 0.233.1 Documentation • Treasure Data Function Reference • 5. Query Optimizer — Presto 331 Documentation Additional Link