Upgrade to Pro — share decks privately, control downloads, hide ads and more …

Presto Performance Tuning

Presto Performance Tuning

Avatar for Nellie Maysela

Nellie Maysela

March 11, 2020
Tweet

Other Decks in Technology

Transcript

  1. Only specifying the columns you need! [Bad]: SELECT * FROM

    table [Good]: SELECT id, name, role FROM table
  2. Example WHERE time <=> Integer SELECT ... WHERE time >

    1349393020 TD_TIME_RANGE SELECT … WHERE TD_TIME_RANGE(time, '2020-03-01','2020-03-15', 'UTC')
  3. Considering the cardinality within GROUP BY [BAD]: SELECT ... GROUP

    BY gender, id [GOOD]: SELECT … GROUP BY id, gender *Another tip: Use numbers instead of strings for the GROUP BY column
  4. Use LIMIT with ORDER BY [BAD]: SELECT * FROM tbl

    ORDER BY time [GOOD]: SELECT * FROM tbl ORDER BY time LIMIT 10
  5. SELECT id, name FROM access WHERE method LIKE '%GET%' OR

    method LIKE '%POST%' OR method LIKE '%PUT%' OR method LIKE '%DELETE%' SELECT Id, name FROM access WHERE regexp_like(method, 'GET|POST|PUT|DELETE') *Or in other DB: PostgreSQL: SIMILAR TO Redshift: SIMILAR TO
  6. Specifying large tables first in join clause “The default join

    algorithm of Presto is broadcast join”
  7. Improving Performance of Very Large Queries CTAS ♥ DROP DROP

    TABLE IF EXISTS my_result; CREATE TABLE my_result AS SELECT * FROM my_table;
  8. Here is a list of memory-intensive • Distinct: ◦ SELECT

    distinct c1, c2, c3 FROM my_table ◦ Count(Distinct x) -> Approx_distinct() • UNION: Prefer UNION ALL over UNION • ORDER BY: Avoid ORDER BY • GROUP BY: Reduce the number of GROUP BY columns • Joins: From the largest table to smaller tables ◦ Use distributed joins ◦ -- set session join_distribution_type = 'PARTITIONED' SELECT * FROM table_a, table_b WHERE table_a.id = table_b.id
  9. Optimize this Query! SELECT hotel_name FROM hotel_tb WHERE pic_role =

    ‘Hotel Owner' OR pic_role = 'Assistant' OR pic_role = 'Hotel Ops'
  10. Optimize this Query! SELECT pic_name, pic_contact FROM hotel_tb WHERE pic_role

    = ‘Hotel Ops’ UNION SELECT pic_name, pic_contact FROM hotel_tb WHERE pic_role like ‘h%’ OR hotel_address = ‘Letjen S Parman no 23’
  11. • 7.10. Regular Expression Functions — Presto 0.233 Documentation •

    Pattern (Java Platform SE 8 ) • 1.2. Presto Concepts — Presto 0.233.1 Documentation • Treasure Data Function Reference • 5. Query Optimizer — Presto 331 Documentation Additional Link