Upgrade to Pro — share decks privately, control downloads, hide ads and more …

When ms matter: Maximizing query performance in...

When ms matter: Maximizing query performance in CrateDB

Achieving optimal execution plans in distributed databases is a challenging task. This talk will focus on CrateDB: a distributed SQL database, and key strategies for optimizing its query performance.

cratedb

June 23, 2023
Tweet

More Decks by cratedb

Other Decks in Technology

Transcript

  1. 2 About the speaker https://marijaselakovic.github.io/ https://www.linkedin.com/in/marija-selakovic/ • Developer advocate @

    CrateDB • Ph.D. in computer science at TU Darmstadt • Background in software engineering and data science • Interests in big data, community engagement, and helping developers to be successful with CrateDB
  2. 3 Distributed databases Network • Performance and availability • Cost-effectiveness

    • Scalability • Fault tolerance • Data consistency • Flexibility
  3. Search Engine SQL NoSQL Performance § Ad-hoc queries across billions

    of records in < 1s § Columnar storage and parallel execution for fast aggregations § Fully distributed query engine on top of Apache Lucene® Ease of Use and Flexibility § Standard SQL and PostgreSQL Wire Protocol § Analyze relational, object, time-series, geo and full-text data all in one system § Fully-managed Cloud, Hybrid and On-Premises (OSS) Scalability and Reliability § Scale from a single node to hundreds analyzing petabytes of data § Analyze historical data, no down-sampling needed § Built-in high availability and automatic failover This talk: CrateDB Distributed SQL database for real-time analytics
  4. 6 Common performance issues (1) Inefficient query execution planner doesn’t

    provide optimal execution plan Full table scan in the query execution plan orders SELECT * FROM orders WHERE right(name, 1) = 'r';
  5. 7 Common performance issues (2) Wrong data model Table with

    too many columns, wrong data type, normalized schemas Affiliation UserData Address Contact
  6. 8 Common performance issues (3) Unbalanced data distribution The wrong

    combination of sharding and partitioning table t1 shard t1.s1 shard t1.s2 shard t1.s1000 … Too many or too few shards
  7. 9 Other performance issues Node failures If happens during query

    execution, the query may need to be rerun on a different node High memory usage Critical utilization of node memory Resource contention Multiple queries trying to access the same resource simultanously leads to longer execution times
  8. 11 Logical plan An abstraction of all transformation steps needed

    to execute a query. Collect operator: • table name (doc.users) • attribute (name) • query expression(true) a condition replaces true expression EXPLAIN SELECT * FROM users WHERE name = 'name'; └ Collect[doc.users | [name, age, ...] | (name = 'name')] EXPLAIN SELECT name FROM users; └ Collect[doc.users | [name] | true]
  9. 12 Understand your queries: example CREATE TABLE t1(x int, y

    int); CREATE TABLE t2(x int, y int); EXPLAIN SELECT * FROM (SELECT t1.x as t1x, t2.x as t2x, t1.y as t1y, t2.y as t2y FROM t1 JOIN t2 ON t2.y > t1.x) AS t WHERE t.t1x = t.t2x; Rename[t1x, t2x, t1y, t2y] AS t └ Eval[*] └ Filter[(x AS t1x = x AS t2x)] └ NestedLoopJoin[INNER | (y > x)] ├ Collect[doc.t1 | [x, y] | true] └ Collect[doc.t2 | [x, y] | true] EXPLAIN SELECT * FROM t1 JOIN t2 ON t1.x = t2.x WHERE t2.y > t1.x; HashJoin[((x = x) AND (y > x))] ├ Collect[doc.t1 | [x, y] | true] └ Collect[doc.t2 | [x, y] | true] https://crate.io/resources/videos/distributed-join-algorithms
  10. 14 Sharding • Logical plan does not reason about data

    distribution • Every table is split into multiple shards by default • Shards are distributed evenly across all nodes in a cluster CREATE TABLE t1 ( name STRING ) CLUSTERED INTO 3 SHARDS; table t1 shard t1.s1 shard t1.s1 shard t1.s2 shard t1.s3
  11. Partitioning splits up a large table into smaller chunks to

    minimize the number of records that need to be queried. Partitioning table t1 January February shard t1.p1.s1 February shard t1.p2.s1 January t1.p2.s1 t1.p1.s1 t1.p2.s2 t1.p1.s2 t1.p2.s3 t1.p1.s3 CREATE TABLE t1 ( name STRING, month TIMESTAMP ) CLUSTERED INTO 3 SHARDS PARTITIONED BY (month);
  12. table t1 shard t1.s1 Replication / High Availability shard t1.r1

    shard t1.r2 shard t1.s1 shard t1.r3 shard t1.s2 shard t1.r1 shard t1.s3 January January January CREATE TABLE t1 ( name STRING ) CLUSTERED INTO 3 SHARDS WITH (“number_of_replicas” = '1'); • Increase parallel read performance • Data is replicated on a shard level • Replica shards can become primary shard in case of failures • Should have at least 1 replica to avoid downtimes
  13. Find the balance § More replicas § Higher read throughput

    § Lower write throughput § More shards and partitions § Improved scalability and performance § Complexity and cost, limited scalability
  14. How many shards? • Ideally for every table on every

    node, you want to have #shards = #CPU cores • Plan for the future, #tables, #of concurrent queries on those tables • Use dedicated master (master-eligible) nodes in the cluster • 1GB heap on dedicated master servers ~ 3000 shards Size of Shards • Too large shards can result in long delays during cluster recovering • Better to plan ahead than increasing/decreasing #shards on a production cluster • Aim for 10GB-50GB shards
  15. Example sharding CREATE TABLE t1(a int, b int) CLUSTERED BY

    (b) INTO 5 shards; EXPLAIN ANALYZE SELECT * FROM t1 WHERE a = 11; 5x "QueryDescription": "a:[11 TO 11]", "QueryName": "PointRangeQuery" Hits 5 shards EXPLAIN ANALYZE SELECT * FROM t1 WHERE a = 11 AND b=1; Hits 1 shard 1x "QueryName": "PointRangeQuery” "QueryName": "PointRangeQuery" "QueryName": "BooleanQuery"
  16. Partitioning strategy Partitioning § Each table partition is internally just

    another table which each own shards § A table with 20 partitions and 10 shards has 200 shards § Choose the correct scheme for your needs, e.g.,: § daily partitions could be too fine grain and you end up with lots of shards. § Changing the partitioning scheme #shards/#replicas only apply to new partitions § Snapshot and close/drop old partitions, if needed to query on archived data, restore the partitions you need
  17. Example partitioning CREATE TABLE t1(a int, b generated always as

    a % 10) CLUSTERED INTO 5 shards PARTITIONED BY (b); EXPLAIN ANALYZE SELECT * FROM t1 WHERE a = 11; 50x "QueryDescription": "a:[11 TO 11]", "QueryName": "PointRangeQuery" Hits 10 partitions, 50 shards EXPLAIN ANALYZE SELECT * FROM t1 WHERE a = 11 AND b=1; Hits 1 partition, 5 shards 5x "QueryDescription": "a:[11 TO 11]", "QueryName": "PointRangeQuery"
  18. Other performance considerations • Make use of the correct LIMIT,

    more than rarely 10k rows are needed • Try to avoid joins as much as possible, use denormalized table schemes • Make use of the right function/operator EXPLAIN ANALYZE SELECT * FROM t1 WHERE s LIKE '%r'; EXPLAIN ANALYZE SELECT * FROM t1 WHERE right(s, 1) = 'r'; "QueryDescription": "s:*r", "QueryName": "MultiTermQueryConstantScoreBlendedWrapper" "QueryDescription": "(right(s, 1) = r)", "QueryName": "GenericFunctionQuery"
  19. Lucene vs SQL CREATE TABLE t1(x int, a int[]) CLUSTERED

    INTO 1 SHARDS; • Consider cases that involve NULL and 3-value logic • ignore3vl function elimitates 3-value logic on the whole tree of operators EXPLAIN ANALYZE SELECT * FROM t1 WHERE NOT 5 = ANY(a); EXPLAIN ANALYZE SELECT * FROM t1 WHERE NOT IGNORE3vl(5 = ANY(a)); 3 nested queries Running time: >1.3s 2 nested queries, eliminates 3-value logic Running time: <0.03s
  20. Community example • https://community.crate.io/t/very-slow-query-performance/1321 • Complex query involving two tables

    • WITH CTE provides more opportunity to optimize the logical plan (from 19s to 6s) • Further replacing BETWEEN with simpler filter (i.e., > and <=) reduces the execution time to 200 ms!
  21. Want to learn more? Join our community • https://community.crate.io/ Read

    more • Documentation: https://crate.io/docs/crate/ • Visit https://crate.io/docs/crate/howtos/en/latest/performance/index.html to learn more about performance Try CrateDB • In the cloud for free: https://crate.io/lp-free-trial • On-premise: https://crate.io/download#cratedb Contact us: [email protected]