"Finding Bugs in Database Systems via Query Partitioning" at OOPSLA '20

389c8e3d83119ec458c5c57e8d92da2a?s=47 Manuel Rigger
October 29, 2020
21

"Finding Bugs in Database Systems via Query Partitioning" at OOPSLA '20

389c8e3d83119ec458c5c57e8d92da2a?s=128

Manuel Rigger

October 29, 2020
Tweet

Transcript

  1. Finding Bugs in Database Systems via Query Partitioning Manuel Rigger

    Zhendong Su ETH Zurich, Switzerland OOPSLA 2020 @RiggerManuel @ast_eth https://people.inf.ethz.ch/suz/
  2. 2 Database Management Systems (DBMSs) PostgreSQL

  3. 3 PostgreSQL Database Management Systems (DBMSs) MySQL is the most

    popular open-source DBMS
  4. 4 Database Management Systems (DBMSs) PostgreSQL TiDB and CockroachDB are

    highly-popular, commercially developed NewSQL DBMSs
  5. 5 PostgreSQL Database Management Systems (DBMSs) DuckDB is an emerging

    embedded DBMS for analytical query workloads
  6. 6 Database Management Systems (DBMSs) PostgreSQL We found 193 bugs

    in these systems, 142 of which have been fixed
  7. 7 Goal: Find Logic Bugs Logic bugs: DBMS returns an

    incorrect result set
  8. 8 Motivating Example c0 0 c0 -0 t0 t1 SELECT

    * FROM t0, t1 WHERE t0.c0 = t1.c0; ?
  9. 9 Motivating Example c0 0 c0 -0 t0 t1 SELECT

    * FROM t0, t1 WHERE t0.c0 = t1.c0; t0.c0 t0.c1 0 -0 The DBMS is expected to fetch the row since 0 = -0 should evaluate to TRUE ✓
  10. 10 Motivating Example c0 0 c0 -0 t0 t1 SELECT

    * FROM t0, t1 WHERE t0.c0 = t1.c0; t0.c0 t0.c1 The latest version of MySQL that we tested failed to fetch the row  https://bugs.mysql.com/bug.php?id=99122
  11. 11 Motivating Example c0 0 c0 -0 t0 t1 SELECT

    * FROM t0, t1 WHERE t0.c0 = t1.c0; t0.c0 t0.c1 Challenge: Finding a test oracle that can automatically validate the result set Incorrect result! https://bugs.mysql.com/bug.php?id=99122
  12. 12 Background: Differential Testing Query Generator

  13. 13 Background: Differential Testing Query Generator RS1 RS2 RS3

  14. 14 Background: Differential Testing Query Generator RS1 = RS2 =

    RS3 ? RS1 RS2 RS3
  15. 15 Background: Differential Testing DBMS- specific SQL Common SQL Core

  16. 16 Background: Differential Testing DBMS- specific SQL Common SQL Core

    “We are unable to use Postgres as an oracle because CockroachDB has slightly different semantics and SQL support, and generating queries that execute identically on both is tricky […].” – Cockroach Labs
  17. 17 Core Idea Novel test oracle: Partition a query into

    multiple partitioning queries, from which the original query’s result set can be derived
  18. 18 Query Partitioning Query Generator Q RS(Q)

  19. 19 Query Partitioning Query Generator Q RS(Q) Q denotes the

    (randomly-generated) original query
  20. 20 Query Partitioning Query Generator Q RS(Q) RS(Q) denotes the

    original query’s result set
  21. 21 Query Partitioning Query Generator Q RS(Q) RS(Q’1 ) RS(Q’2

    ) RS(Q’3 ) RS(Q’n ) Q’1 Q’2 Q’3 Q’n Partition the result set
  22. 22 Query Partitioning Query Generator Q RS(Q) RS(Q’) RS(Q’1 )

    RS(Q’2 ) RS(Q’3 ) RS(Q’n ) Q’1 Q’2 Q’3 Q’n ♢ Q’
  23. 23 Query Partitioning Query Generator Q RS(Q) RS(Q’) RS(Q’1 )

    RS(Q’2 ) RS(Q’3 ) RS(Q’n ) Q’1 Q’2 Q’3 Q’n ♢ Q’ Combine the results so that RS(Q)=RS(Q’)
  24. 24 Query Partitioning Query Generator Q RS(Q) RS(Q’) RS(Q’1 )

    RS(Q’2 ) RS(Q’3 ) RS(Q’n ) Q’1 Q’2 Q’3 Q’n ♢ Q’ ✓ =
  25. 25 Query Partitioning Query Generator Q RS(Q) RS(Q’) RS(Q’1 )

    RS(Q’2 ) RS(Q’3 ) RS(Q’n ) Q’1 Q’2 Q’3 Q’n ♢ Q’ ≠
  26. 26 Query Partitioning Query Generator Q RS(Q) RS(Q’) RS(Q’1 )

    RS(Q’2 ) RS(Q’3 ) RS(Q’n ) Q’1 Q’2 Q’3 Q’n ♢ Q’ Partitioning queries
  27. 27 Query Partitioning Query Generator Q RS(Q) RS(Q’) RS(Q’1 )

    RS(Q’2 ) RS(Q’3 ) RS(Q’n ) Q’1 Q’2 Q’3 Q’n ♢ Q’ Partitions
  28. 28 Query Partitioning Query Generator Q RS(Q) RS(Q’) RS(Q’1 )

    RS(Q’2 ) RS(Q’3 ) RS(Q’n ) Q’1 Q’2 Q’3 Q’n ♢ Q’ Composition operator
  29. 29 How to Realize This Idea? Key challenge: find a

    valid partitioning strategy that stresses the DBMS
  30. 30 Ternary Logic Consider a predicate φ and a given

    row r. Exactly one of the following must hold: • φ • NOT φ • φ IS NULL
  31. 31 Ternary Logic Consider a predicate φ and a given

    row r. Exactly one of the following must hold: • φ • NOT φ • φ IS NULL
  32. 32 Ternary Logic Consider a predicate φ and a given

    row r. Exactly one of the following must hold: • φ • NOT φ • φ IS NULL φ NOT φ φ IS NULL
  33. 33 Ternary Logic Consider a predicate φ and a given

    row r. Exactly one of the following must hold: • φ • NOT φ • φ IS NULL ternary predicate variants
  34. 34 Motivating Example c0 0 c0 -0 t0 t1 SELECT

    * FROM t0, t1 WHERE t0.c0 = t1.c0; t0.c0 t0.c1  https://bugs.mysql.com/bug.php?id=99122 How did this insight allow us to detect this bug?
  35. 35 Example: MySQL SELECT * FROM t0, t1;

  36. 36 Example: MySQL t0.c0 t0.c1 0 -0 SELECT * FROM

    t0, t1;
  37. 37 Example: MySQL t0.c0 t0.c1 0 -0 SELECT * FROM

    t0, t1 WHERE t0.c0=t1.c0 UNION ALL SELECT * FROM t0, t1 WHERE NOT (t0.c0=t1.c0) UNION ALL SELECT * FROM t0, t1 WHERE (t0.c0=t1.c0) IS NULL; SELECT * FROM t0, t1;
  38. 38 Example: MySQL t0.c0 t0.c1 0 -0 SELECT * FROM

    t0, t1 WHERE t0.c0=t1.c0 UNION ALL SELECT * FROM t0, t1 WHERE NOT (t0.c0=t1.c0) UNION ALL SELECT * FROM t0, t1 WHERE (t0.c0=t1.c0) IS NULL; SELECT * FROM t0, t1; φ
  39. 39 Example: MySQL t0.c0 t0.c1 0 -0 SELECT * FROM

    t0, t1 WHERE t0.c0=t1.c0 UNION ALL SELECT * FROM t0, t1 WHERE NOT (t0.c0=t1.c0) UNION ALL SELECT * FROM t0, t1 WHERE (t0.c0=t1.c0) IS NULL; SELECT * FROM t0, t1; t0.c0 t0.c1 φ
  40. 40 Example: MySQL t0.c0 t0.c1 0 -0 SELECT * FROM

    t0, t1 WHERE t0.c0=t1.c0 UNION ALL SELECT * FROM t0, t1 WHERE NOT (t0.c0=t1.c0) UNION ALL SELECT * FROM t0, t1 WHERE (t0.c0=t1.c0) IS NULL; SELECT * FROM t0, t1; t0.c0 t0.c1 ≠ φ
  41. 41 Ternary Logic Partitioning (TLP) Query Generator Q RS(Q) RS(Q’)

    Q’p Q’p IS NULL ♢ Q’ Q ¬p RS(Q’p ) RS(Q ¬p ) RS(Q’p IS NULL ) SELECT * FROM t0, t1;
  42. 42 Ternary Logic Partitioning (TLP) Query Generator Q RS(Q) RS(Q’)

    Q’p Q’p IS NULL ♢ Q’ Q ¬p RS(Q’p ) RS(Q ¬p ) RS(Q’p IS NULL ) SELECT * FROM t0, t1 WHERE t0.c0=t1.c0
  43. 43 Ternary Logic Partitioning (TLP) Query Generator Q RS(Q) RS(Q’)

    Q’p Q’p IS NULL ♢ Q’ Q ¬p RS(Q’p ) RS(Q ¬p ) RS(Q’p IS NULL ) SELECT * FROM t0, t1 WHERE NOT (t0.c0=t1.c0)
  44. 44 Ternary Logic Partitioning (TLP) Query Generator Q RS(Q) RS(Q’)

    Q’p Q’p IS NULL ♢ Q’ Q ¬p RS(Q’p ) RS(Q ¬p ) RS(Q’p IS NULL ) SELECT * FROM t0, t1 WHERE (t0.c0=t1.c0) IS NULL;
  45. 45 Ternary Logic Partitioning (TLP) Query Generator Q RS(Q) RS(Q’)

    Q’p Q’p IS NULL ♢ Q’ Q ¬p RS(Q’p ) RS(Q ¬p ) RS(Q’p IS NULL ) UNION ALL
  46. 46 Scope • WHERE • GROUP BY • HAVING •

    DISTINCT queries • Aggregate functions
  47. 47 Testing WHERE Clauses Q Q’ptern ♢(Q’p , Q’¬p ,

    Q’p IS NULL ) SELECT <columns> FROM <tables> [<joins>] SELECT <columns> FROM <tables> [<joins>] WHERE ptern Q′p ⊎ Q′¬p ⊎ Q′p IS NULL
  48. 48 Testing WHERE Clauses Q Q’ptern ♢(Q’p , Q’¬p ,

    Q’p IS NULL ) SELECT <columns> FROM <tables> [<joins>] SELECT <columns> FROM <tables> [<joins>] WHERE ptern Q′p ⊎ Q′¬p ⊎ Q′p IS NULL The multiset addition can be implemented using UNION ALL
  49. 49 Scope • WHERE • GROUP BY • HAVING •

    DISTINCT queries • Aggregate functions
  50. 50 Testing Self-decomposable Aggregate Functions Q Q’ptern ♢(Q’p , Q’¬p

    , Q’p IS NULL ) SELECT MAX(<e>) FROM <tables> [<joins>] SELECT MAX(<e>) FROM <tables> [<joins>] WHERE ptern; MAX(Q′p ⊎ Q′¬p ⊎ Q′p IS NULL )
  51. 51 Testing Self-decomposable Aggregate Functions Q Q’ptern ♢(Q’p , Q’¬p

    , Q’p IS NULL ) SELECT MAX(<e>) FROM <tables> [<joins>] SELECT MAX(<e>) FROM <tables> [<joins>] WHERE ptern; MAX(Q′p ⊎ Q′¬p ⊎ Q′p IS NULL ) A partition is an intermediate result, rather than a subset of the result set
  52. 52 Bug Example: CockroachDB SET vectorize=experimental_on; CREATE TABLE t0(c0 INT);

    CREATE TABLE t1(c0 BOOL) INTERLEAVE IN PARENT t0(rowid); INSERT INTO t0(c0) VALUES (0); INSERT INTO t1(rowid, c0) VALUES(0, TRUE);
  53. 53 Bug Example: CockroachDB SET vectorize=experimental_on; CREATE TABLE t0(c0 INT);

    CREATE TABLE t1(c0 BOOL) INTERLEAVE IN PARENT t0(rowid); INSERT INTO t0(c0) VALUES (0); INSERT INTO t1(rowid, c0) VALUES(0, TRUE); NULL SELECT MAX(t1.rowid) FROM t1;
  54. 54 Bug Example: CockroachDB SET vectorize=experimental_on; CREATE TABLE t0(c0 INT);

    CREATE TABLE t1(c0 BOOL) INTERLEAVE IN PARENT t0(rowid); INSERT INTO t0(c0) VALUES (0); INSERT INTO t1(rowid, c0) VALUES(0, TRUE); NULL 0 SELECT MAX(aggr) FROM ( SELECT MAX(t1.rowid) as aggr FROM t1 WHERE '+' >= t1.c0 UNION ALL SELECT MAX(t1.rowid) as aggr FROM t1 WHERE NOT('+' >= t1.c0) UNION ALL SELECT MAX(t1.rowid) as aggr FROM t1 WHERE ('+' >= t1.c0) IS NULL ); SELECT MAX(t1.rowid) FROM t1; ≠
  55. 55 Testing Decomposable Aggregate Functions Q Q’ptern ♢(Q’p , Q’¬p

    , Q’p IS NULL ) SELECT AVG(<e>) FROM <tables> [<joins>]; SELECT SUM(<e>) as s, COUNT(<e>) as s FROM <tables> [<joins>]; SUM(s(Q′p ⊎ Q′¬p ⊎ Q′p IS NULL )) SUM(c(Q′p ⊎ Q′¬p ⊎ Q′p IS NULL ))
  56. 56 Implementation https://github.com/sqlancer

  57. 57 SQLancer Generate a Database Generate a Query Validate the

    Query’s Result
  58. 58 SQLancer Generate a Database Generate a Query Validate the

    Query’s Result Heuristic database and query generators for DuckDB, H2 and TiDB
  59. 59 SQLancer Generate a Database Generate a Query Validate the

    Query’s Result TLP test oracles
  60. 60 Evaluation: Found Bugs Closed DBMS Fixed Verified Intended Duplicate

    SQLite 4 0 0 0 MySQL 1 6 3 0 H2 16 2 0 1 CockroachDB 23 8 0 0 TiDB 26 35 0 1 DuckDB 72 0 0 2 We found 193 unique, previously unknown bugs, 142 of which have been fixed!
  61. 61 Evaluation: Found Bugs Query Partitioning Oracle DBMS WHERE Aggregate

    GROUP BY HAVING DISTINCT Error Crash SQLite 0 3 0 0 1 0 0 CockroachDB 3 3 0 1 0 22 2 TiDB 29 0 1 0 0 27 4 MySQL 7 0 0 0 0 0 0 DuckDB 21 4 1 2 1 13 19 H2 2 0 0 0 0 16 0
  62. 62 Query Partitioning Oracle DBMS WHERE Aggregate GROUP BY HAVING

    DISTINCT Error Crash SQLite 0 3 0 0 1 0 0 CockroachDB 3 3 0 1 0 22 2 TiDB 29 0 1 0 0 27 4 MySQL 7 0 0 0 0 0 0 DuckDB 21 4 1 2 1 13 19 H2 2 0 0 0 0 16 0 Evaluation: Found Bugs The WHERE oracle is the simplest, but most effective oracle
  63. 63 Query Partitioning Oracle DBMS WHERE Aggregate GROUP BY HAVING

    DISTINCT Error Crash SQLite 0 3 0 0 1 0 0 CockroachDB 3 3 0 1 0 22 2 TiDB 29 0 1 0 0 27 4 MySQL 7 0 0 0 0 0 0 DuckDB 21 4 1 2 1 13 19 H2 2 0 0 0 0 16 0 Evaluation: Found Bugs The other oracles found interesting, but fewer bugs
  64. 64 Query Partitioning Oracle DBMS WHERE Aggregate GROUP BY HAVING

    DISTINCT Error Crash SQLite 0 3 0 0 1 0 0 CockroachDB 3 3 0 1 0 22 2 TiDB 29 0 1 0 0 27 4 MySQL 7 0 0 0 0 0 0 DuckDB 21 4 1 2 1 13 19 H2 2 0 0 0 0 16 0 Evaluation: Found Bugs We tested them only for DBMSs for which the TLP WHERE oracle could not find additional bugs
  65. 65 Developer Reception: DuckDB This work is tremendously helpful for

    us, and I imagine anyone working on a DBMS. Usually these bugs would be slowly found by users over the years, not only negatively affecting the experience of those users but also requiring much more effort to debug and reproduce […].
  66. 66 Adoption

  67. 67 Adoption DuckDB runs SQLancer + TLP on every pull

    request
  68. 68 Adoption Ilya Yatsishin (Yandex) contributed a TLP implementation to

    SQLancer
  69. 69 Adoption https://www.monetdb.org/blog/faster-robuster-and-feature-richer-monetdb-in-2020-and-beyond “With the help of SQLancer, an automatic

    DBMS testing tool, we have been able to identify >100 potential problems in corner cases of the SQL processor.”
  70. 70 Adoption PingCAP implemented TLP as a tool go-sqlancer https://github.com/chaos-mesh/go-sqlancer

  71. 71 Adoption Mining for logic bugs in the Citus extension

    to Postgres with SQLancer by Nazli Ugur Koyluoglu https://www.citusdata.com/blog/2020/09/04/mining-for-logic-bugs-in-citus-with-sqlancer/
  72. 72 @RiggerManuel manuel.rigger@inf.ethz.ch Summary Goal: detect logic bugs Query Partitioning

    as a general concept TLP as a concrete partition strategy Evaluation: Close to 200 bugs in DBMSs