Slide 1

Slide 1 text

Testing Database Engines via Pivoted Query Synthesis Manuel Rigger Zhendong Su ETH Zurich, Switzerland 11/05/2020 @RiggerManuel @ast_eth https://people.inf.ethz.ch/suz/

Slide 2

Slide 2 text

2 Database Management Systems (DBMSs) PostgreSQL

Slide 3

Slide 3 text

3 Database Management Systems (DBMSs) PostgreSQL “it seems likely that there are over one trillion (1e12) SQLite databases in active use” https://www.sqlite.org/mostdeployed.html

Slide 4

Slide 4 text

4 Database Management Systems (DBMSs) PostgreSQL We found 96 unique bugs in these DBMSs, 78 of which were fixed!

Slide 5

Slide 5 text

5 Goal: Find Logic Bugs Logic bugs: DBMS returns an incorrect result set

Slide 6

Slide 6 text

6 Example: SQLite3 Bug c0 0 1 2 NULL t0 CREATE TABLE t0(c0); CREATE INDEX i0 ON t0(1) WHERE c0 NOT NULL; INSERT INTO t0 (c0) VALUES (0), (1), (2), (NULL); SELECT c0 FROM t0 WHERE t0.c0 IS NOT 1; https://sqlite.org/src/tktview/80256748471a01

Slide 7

Slide 7 text

7 Example: SQLite3 Bug c0 0 1 2 NULL t0 CREATE TABLE t0(c0); CREATE INDEX i0 ON t0(1) WHERE c0 NOT NULL; INSERT INTO t0 (c0) VALUES (0), (1), (2), (NULL); SELECT c0 FROM t0 WHERE t0.c0 IS NOT 1; IS NOT is a “null-safe” comparison operator https://sqlite.org/src/tktview/80256748471a01

Slide 8

Slide 8 text

8 Example: SQLite3 Bug c0 0 1 2 NULL t0 CREATE TABLE t0(c0); CREATE INDEX i0 ON t0(1) WHERE c0 NOT NULL; INSERT INTO t0 (c0) VALUES (0), (1), (2), (NULL); SELECT c0 FROM t0 WHERE t0.c0 IS NOT 1; https://sqlite.org/src/tktview/80256748471a01

Slide 9

Slide 9 text

9 Example: SQLite3 Bug c0 0 1 2 NULL t0 0 CREATE TABLE t0(c0); CREATE INDEX i0 ON t0(1) WHERE c0 NOT NULL; INSERT INTO t0 (c0) VALUES (0), (1), (2), (NULL); SELECT c0 FROM t0 WHERE t0.c0 IS NOT 1; TRUE https://sqlite.org/src/tktview/80256748471a01

Slide 10

Slide 10 text

10 Example: SQLite3 Bug c0 0 1 2 NULL t0 0 CREATE TABLE t0(c0); CREATE INDEX i0 ON t0(1) WHERE c0 NOT NULL; INSERT INTO t0 (c0) VALUES (0), (1), (2), (NULL); SELECT c0 FROM t0 WHERE t0.c0 IS NOT 1; TRUE 0 https://sqlite.org/src/tktview/80256748471a01

Slide 11

Slide 11 text

11 Example: SQLite3 Bug c0 0 1 2 NULL t0 1 CREATE TABLE t0(c0); CREATE INDEX i0 ON t0(1) WHERE c0 NOT NULL; INSERT INTO t0 (c0) VALUES (0), (1), (2), (NULL); SELECT c0 FROM t0 WHERE t0.c0 IS NOT 1; 0 FALSE https://sqlite.org/src/tktview/80256748471a01

Slide 12

Slide 12 text

12 Example: SQLite3 Bug c0 0 1 2 NULL t0 0 2 2 CREATE TABLE t0(c0); CREATE INDEX i0 ON t0(1) WHERE c0 NOT NULL; INSERT INTO t0 (c0) VALUES (0), (1), (2), (NULL); SELECT c0 FROM t0 WHERE t0.c0 IS NOT 1; TRUE https://sqlite.org/src/tktview/80256748471a01

Slide 13

Slide 13 text

13 Example: SQLite3 Bug c0 0 1 2 NULL t0 0 2 NULL NULL CREATE TABLE t0(c0); CREATE INDEX i0 ON t0(1) WHERE c0 NOT NULL; INSERT INTO t0 (c0) VALUES (0), (1), (2), (NULL); SELECT c0 FROM t0 WHERE t0.c0 IS NOT 1; TRUE https://sqlite.org/src/tktview/80256748471a01

Slide 14

Slide 14 text

14 Example: SQLite3 Bug c0 0 1 2 NULL t0 NULL CREATE TABLE t0(c0); CREATE INDEX i0 ON t0(1) WHERE c0 NOT NULL; INSERT INTO t0 (c0) VALUES (0), (1), (2), (NULL); SELECT c0 FROM t0 WHERE t0.c0 IS NOT 1; TRUE 0 2 NULL https://sqlite.org/src/tktview/80256748471a01

Slide 15

Slide 15 text

15 Example: SQLite3 Bug c0 0 1 2 NULL t0 NULL CREATE TABLE t0(c0); CREATE INDEX i0 ON t0(1) WHERE c0 NOT NULL; INSERT INTO t0 (c0) VALUES (0), (1), (2), (NULL); SELECT c0 FROM t0 WHERE t0.c0 IS NOT 1; TRUE 0 2 NULL 0 2 https://sqlite.org/src/tktview/80256748471a01

Slide 16

Slide 16 text

16 Example: SQLite3 Bug c0 0 1 2 NULL t0 NULL CREATE TABLE t0(c0); CREATE INDEX i0 ON t0(1) WHERE c0 NOT NULL; INSERT INTO t0 (c0) VALUES (0), (1), (2), (NULL); SELECT c0 FROM t0 WHERE t0.c0 IS NOT 1; TRUE  NULL was not contained in the result set! 0 2 NULL 0 2 https://sqlite.org/src/tktview/80256748471a01
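The walkthrough above can be replayed against whatever SQLite build ships with Python. This is a minimal sketch, assuming the standard-library `sqlite3` module; the helper names `is_not` and `run_example` are made up for illustration. On a build that contains the fix, the query should return 0, 2, and NULL; on an affected build, NULL would show up in `missing`.

```python
import sqlite3

def is_not(left, right):
    """Model of the null-safe IS NOT: NULL (None) is compared like an ordinary value."""
    return left != right  # None != 1 is True, None != None is False

def run_example():
    """Replay the statements from the slides on an in-memory database."""
    con = sqlite3.connect(":memory:")
    con.executescript("""
        CREATE TABLE t0(c0);
        CREATE INDEX i0 ON t0(1) WHERE c0 NOT NULL;
        INSERT INTO t0 (c0) VALUES (0), (1), (2), (NULL);
    """)
    rows = [r[0] for r in con.execute("SELECT c0 FROM t0 WHERE t0.c0 IS NOT 1")]
    con.close()
    return rows

# Expected result set, computed row by row and independently of the DBMS.
expected = [v for v in (0, 1, 2, None) if is_not(v, 1)]
actual = run_example()
missing = [v for v in expected if v not in actual]
print("expected:", expected, "actual:", actual, "missing:", missing)
```

The Python model ignores SQLite's type affinity rules, which is fine for the integer and NULL values used here but would not hold for mixed-type data.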

Slide 17

Slide 17 text

17 Background: Differential Testing PostgreSQL SELECT c0 FROM t0 WHERE t0.c0 IS NOT 1; Massive Stochastic Testing of SQL by Slutz, 1998.

Slide 18

Slide 18 text

18 Background: Differential Testing PostgreSQL RS1 RS2 RS3 SELECT c0 FROM t0 WHERE t0.c0 IS NOT 1; Massive Stochastic Testing of SQL by Slutz, 1998.

Slide 19

Slide 19 text

19 Background: Differential Testing PostgreSQL RS1 RS2 RS3 SELECT c0 FROM t0 WHERE t0.c0 IS NOT 1; Check that all DBMSs compute the same result (RS1 = RS2 = RS3) Massive Stochastic Testing of SQL by Slutz, 1998.
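The comparison step of differential testing can be sketched as follows. This is an illustration only: the function name `differential_check`, the DBMS names, and the result sets are hypothetical, and using the majority result as the reference is an assumption of this sketch rather than something the slides prescribe.

```python
from collections import Counter

def differential_check(result_sets):
    """Return the names of DBMSs whose result set differs from the majority.

    result_sets maps a DBMS name to a list of row tuples. Rows are compared
    as multisets, so row order is ignored but duplicate rows still count.
    """
    bags = {name: Counter(rows) for name, rows in result_sets.items()}
    # Freeze each multiset so that identical results can be counted.
    frozen = {name: frozenset(bag.items()) for name, bag in bags.items()}
    reference, _ = Counter(frozen.values()).most_common(1)[0]
    return sorted(name for name, key in frozen.items() if key != reference)

# Hypothetical result sets RS1..RS3 for the same query on three DBMSs.
results = {
    "dbms_a": [(0,), (2,)],
    "dbms_b": [(2,), (0,), (None,)],
    "dbms_c": [(0,), (2,), (None,)],
}
print(differential_check(results))  # ['dbms_a']
```

The check presupposes that every DBMS accepts the same statements, which is exactly the assumption that breaks down once dialects differ.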

Slide 20

Slide 20 text

20 Background: Differential Testing PostgreSQL RS1 RS2 RS3

Slide 21

Slide 21 text

21 Background: Differential Testing PostgreSQL RS1 RS2 RS3

Slide 22

Slide 22 text

22 Background: Differential Testing {0, 2} Syntax error Syntax error PostgreSQL

Slide 23

Slide 23 text

23 Background: Differential Testing CREATE TABLE t0(c0); CREATE INDEX i0 ON t0(1) WHERE c0 NOT NULL; INSERT INTO t0 (c0) VALUES (0), (1), (2), (3), (NULL); SELECT c0 FROM t0 WHERE t0.c0 IS NOT 1;

Slide 24

Slide 24 text

24 Background: Differential Testing CREATE TABLE t0(c0); CREATE INDEX i0 ON t0(1) WHERE c0 NOT NULL; INSERT INTO t0 (c0) VALUES (0), (1), (2), (3), (NULL); SELECT c0 FROM t0 WHERE t0.c0 IS NOT 1; MySQL and PostgreSQL require a data type definition

Slide 25

Slide 25 text

25 Background: Differential Testing CREATE TABLE t0(c0); CREATE INDEX i0 ON t0(1) WHERE c0 NOT NULL; INSERT INTO t0 (c0) VALUES (0), (1), (2), (3), (NULL); SELECT c0 FROM t0 WHERE t0.c0 IS NOT 1; PostgreSQL provides an IS DISTINCT FROM operator, and MySQL a <=> null-safe comparison operator
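To make the dialect gap concrete, the sketch below renders the same null-safe "is different from" predicate for the three DBMSs. The dictionary and function names are hypothetical, and the `INT` column type for MySQL and PostgreSQL is an assumption (the slides only say that a type is required).

```python
# Null-safe "a is different from b", spelled per dialect.
NULL_SAFE_NOT_EQUAL = {
    "sqlite": "{a} IS NOT {b}",
    "postgresql": "{a} IS DISTINCT FROM {b}",
    "mysql": "NOT ({a} <=> {b})",  # <=> is null-safe equality, so negate it
}

# SQLite accepts untyped columns; INT is an assumed type for the other two.
CREATE_TABLE = {
    "sqlite": "CREATE TABLE t0(c0)",
    "postgresql": "CREATE TABLE t0(c0 INT)",
    "mysql": "CREATE TABLE t0(c0 INT)",
}

def render_query(dialect, column="t0.c0", literal="1"):
    """Build the example query with the dialect's null-safe operator."""
    predicate = NULL_SAFE_NOT_EQUAL[dialect].format(a=column, b=literal)
    return f"SELECT c0 FROM t0 WHERE {predicate}"

for dialect in ("sqlite", "postgresql", "mysql"):
    print(dialect, "|", CREATE_TABLE[dialect], "|", render_query(dialect))
```

Even this one-operator example needs a per-dialect translation table, which hints at why a common SQL subset covers few DBMS-specific features.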

Slide 26

Slide 26 text

26 Idea: PQS Pivoted Query Synthesis (PQS): Divide-and-conquer approach for testing DBMSs

Slide 27

Slide 27 text

27 PQS Idea c0 0 1 2 NULL t0 CREATE TABLE t0(c0); CREATE INDEX i0 ON t0(1) WHERE c0 NOT NULL; INSERT INTO t0 (c0) VALUES (0), (1), (2), (3), (NULL); SELECT c0 FROM t0 WHERE t0.c0 IS NOT 1; Validate the result set based on one randomly-selected row

Slide 28

Slide 28 text

28 PQS Idea c0 0 1 2 NULL t0 CREATE TABLE t0(c0); CREATE INDEX i0 ON t0(1) WHERE c0 NOT NULL; INSERT INTO t0 (c0) VALUES (0), (1), (2), (3), (NULL); SELECT c0 FROM t0 WHERE t0.c0 IS NOT 1; Pivot row Validate the result set based on one randomly-selected row

Slide 29

Slide 29 text

29 PQS Idea c0 0 1 2 NULL t0 CREATE TABLE t0(c0); CREATE INDEX i0 ON t0(1) WHERE c0 NOT NULL; INSERT INTO t0 (c0) VALUES (0), (1), (2), (3), (NULL); SELECT c0 FROM t0 WHERE t0.c0 IS NOT 1; Generate a query that is guaranteed to at least fetch the pivot row NULL TRUE

Slide 30

Slide 30 text

30 PQS Idea c0 0 1 2 NULL t0 CREATE TABLE t0(c0); CREATE INDEX i0 ON t0(1) WHERE c0 NOT NULL; INSERT INTO t0 (c0) VALUES (0), (1), (2), (3), (NULL); SELECT c0 FROM t0 WHERE t0.c0 IS NOT 1; If the pivot row is missing from the result set, a bug has been detected 0 2

Slide 31

Slide 31 text

31 Approach Randomly generate database Select pivot row Generate query for the pivot row Validate that the pivot row is contained

Slide 32

Slide 32 text

32 Approach Randomly generate database Select pivot row Generate query for the pivot row Validate that the pivot row is contained CREATE TABLE t0(c0); CREATE INDEX i0 ON t0(1) WHERE c0 NOT NULL; INSERT INTO t0 (c0) VALUES (0), (1), (2), (3), (NULL); Statements are heuristically generated based on the DBMS’ SQL dialect

Slide 33

Slide 33 text

33 Approach Randomly generate database Select pivot row Generate query for the pivot row Validate that the pivot row is contained One random row from multiple tables and views
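Pivot-row selection can be sketched as picking one random row per table and merging the picks into a single mapping from qualified column names to values. This is an assumption-laden illustration: `select_pivot_row` is a hypothetical helper, table `t1` is invented for the demo, and views are left out because the sketch orders by `rowid`.

```python
import random
import sqlite3

def select_pivot_row(con, tables, rng):
    """Pick one random row per table; together they form the pivot row."""
    pivot = {}
    for table in tables:
        # ORDER BY rowid keeps the choice reproducible for a given seed.
        cur = con.execute(f"SELECT * FROM {table} ORDER BY rowid")
        columns = [d[0] for d in cur.description]
        rows = cur.fetchall()
        if not rows:
            return None  # an empty table cannot contribute to a pivot row
        row = rng.choice(rows)
        for column, value in zip(columns, row):
            pivot[f"{table}.{column}"] = value
    return pivot

con = sqlite3.connect(":memory:")
con.executescript("""
    CREATE TABLE t0(c0);
    CREATE TABLE t1(c1);
    INSERT INTO t0 (c0) VALUES (0), (1), (2), (NULL);
    INSERT INTO t1 (c1) VALUES ('a'), ('b');
""")
pivot = select_pivot_row(con, ["t0", "t1"], random.Random(0))
print(pivot)
```

Keying the pivot row by qualified column name lets later steps resolve column references such as `t0.c0` with a plain dictionary lookup.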

Slide 34

Slide 34 text

34 Approach Randomly generate database Select pivot row Generate query for the pivot row Validate that the pivot row is contained Generate predicates that evaluate to TRUE for the pivot row and use them in JOIN and WHERE clauses SELECT c0 FROM t0 WHERE

Slide 35

Slide 35 text

35 Random Expression Generation t0.c0 IS NOT 1; Randomly generate database Select pivot row Generate query for the pivot row Validate that the pivot row is contained IS NOT t0.c0 1

Slide 36

Slide 36 text

36 Random Expression Generation t0.c0 IS NOT 1; We implemented an expression evaluator for each node Randomly generate database Select pivot row Generate query for the pivot row Validate that the pivot row is contained IS NOT t0.c0 1

Slide 37

Slide 37 text

37 Random Expression Generation c0 0 1 2 NULL t0 Evaluate the tree based on the pivot row Randomly generate database Select pivot row Generate query for the pivot row Validate that the pivot row is contained IS NOT t0.c0 1

Slide 38

Slide 38 text

38 Random Expression Generation Column references return the values from the pivot row c0 0 1 2 NULL t0 Randomly generate database Select pivot row Generate query for the pivot row Validate that the pivot row is contained IS NOT t0.c0 1

Slide 39

Slide 39 text

39 Random Expression Generation Column references return the values from the pivot row c0 0 1 2 NULL t0 Randomly generate database Select pivot row Generate query for the pivot row Validate that the pivot row is contained IS NOT t0.c0 1 NULL

Slide 40

Slide 40 text

40 Random Expression Generation Constant nodes return their assigned literal values c0 0 1 2 NULL t0 Randomly generate database Select pivot row Generate query for the pivot row Validate that the pivot row is contained IS NOT t0.c0 1 NULL

Slide 41

Slide 41 text

41 Random Expression Generation Constant nodes return their assigned literal values c0 0 1 2 NULL t0 Randomly generate database Select pivot row Generate query for the pivot row Validate that the pivot row is contained IS NOT t0.c0 1 NULL 1

Slide 42

Slide 42 text

42 Random Expression Generation Compound nodes compute their result based on their children TRUE c0 0 1 2 NULL t0 Randomly generate database Select pivot row Generate query for the pivot row Validate that the pivot row is contained IS NOT t0.c0 1 NULL 1

Slide 43

Slide 43 text

43 Random Expression Generation Compound nodes compute their result based on their children TRUE c0 0 1 2 NULL t0 Randomly generate database Select pivot row Generate query for the pivot row Validate that the pivot row is contained IS NOT t0.c0 1 NULL 1 TRUE
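The three node kinds from the preceding slides (column reference, constant, compound) can be sketched as a tiny expression tree with one `evaluate` method per node. The class names are hypothetical and only the `IS NOT` operator is modelled; a real evaluator would need one such class per operator and function, including SQL's three-valued logic and type conversions.

```python
from dataclasses import dataclass

@dataclass
class Column:
    name: str
    def evaluate(self, pivot):
        return pivot[self.name]  # column references read the pivot row
    def to_sql(self):
        return self.name

@dataclass
class Constant:
    value: object
    def evaluate(self, pivot):
        return self.value  # constants return their literal value
    def to_sql(self):
        return "NULL" if self.value is None else str(self.value)

@dataclass
class IsNot:
    left: object
    right: object
    def evaluate(self, pivot):
        # Null-safe comparison: NULL (None) behaves like an ordinary value.
        return self.left.evaluate(pivot) != self.right.evaluate(pivot)
    def to_sql(self):
        return f"{self.left.to_sql()} IS NOT {self.right.to_sql()}"

tree = IsNot(Column("t0.c0"), Constant(1))
pivot_row = {"t0.c0": None}  # the pivot row from the slides, c0 = NULL
print(tree.to_sql(), "->", tree.evaluate(pivot_row))  # t0.c0 IS NOT 1 -> True
```

Because the same tree yields both the SQL text and the expected value, the tester knows what the DBMS should compute for the pivot row without asking the DBMS.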

Slide 44

Slide 44 text

44 Query Synthesis SELECT c0 FROM t0 WHERE t0.c0 IS NOT 1; Randomly generate database Select pivot row Generate query for the pivot row Validate that the pivot row is contained

Slide 45

Slide 45 text

45 Query Synthesis SELECT c0 FROM t0 WHERE t0.c0 IS NOT 1; What if the expression does not evaluate to TRUE? Randomly generate database Select pivot row Generate query for the pivot row Validate that the pivot row is contained

Slide 46

Slide 46 text

46 Random Expression Rectification switch (result) { case TRUE: result = randexpr; break; case FALSE: result = NOT randexpr; break; case NULL: result = randexpr IS NULL; break; } Randomly generate database Select pivot row Generate query for the pivot row Validate that the pivot row is contained

Slide 47

Slide 47 text

47 Random Expression Rectification switch (result) { case TRUE: result = randexpr; break; case FALSE: result = NOT randexpr; break; case NULL: result = randexpr IS NULL; break; } Alternatively, we could validate that the pivot row is, as expected, not fetched Randomly generate database Select pivot row Generate query for the pivot row Validate that the pivot row is contained
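The rectification rule can be sketched as a small function over the SQL text of the random expression. The name `rectify` and the demo expression are hypothetical; the loop at the end uses SQLite only to confirm that each rectified predicate evaluates to TRUE (1), while the expected value itself is computed on the Python side, as a tester's own evaluator would.

```python
import sqlite3

def rectify(expr_sql, value):
    """Wrap expr_sql so that it evaluates to TRUE for the pivot row.

    value is the expression's result on the pivot row:
    True, False, or None (SQL NULL).
    """
    if value is None:
        return f"({expr_sql}) IS NULL"
    if value:
        return expr_sql
    return f"NOT ({expr_sql})"

con = sqlite3.connect(":memory:")
checked = []
for pivot_value in (0, 1, 2, None):
    literal = "NULL" if pivot_value is None else str(pivot_value)
    expr = f"{literal} = 1"  # stand-in for a random expression
    # Expected value under SQL semantics: NULL = 1 is NULL.
    value = None if pivot_value is None else (pivot_value == 1)
    rectified = rectify(expr, value)
    checked.append(con.execute(f"SELECT {rectified}").fetchone()[0])
print(checked)  # [1, 1, 1, 1]
```

Parenthesizing the wrapped expression avoids precedence surprises when the random expression itself contains operators.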

Slide 48

Slide 48 text

48 Approach SELECT (NULL) INTERSECT SELECT c0 FROM t0 WHERE NULL IS NOT 1; Rely on the DBMS to check whether the row is contained Randomly generate database Select pivot row Generate query for the pivot row Validate that the pivot row is contained
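The containment check can be delegated to the DBMS with `INTERSECT`, which treats NULL values as equal when comparing rows. The sketch below assumes SQLite via Python's `sqlite3`; `sql_literal` and `pivot_row_contained` are hypothetical helper names, and the partial index is left out so that the demo does not depend on the bug shown earlier.

```python
import sqlite3

def sql_literal(value):
    """Render a Python value as a SQL literal (NULL, number, or quoted string)."""
    if value is None:
        return "NULL"
    if isinstance(value, str):
        return "'" + value.replace("'", "''") + "'"
    return str(value)

def pivot_row_contained(con, query, pivot_values):
    """Return True if the pivot row appears in the result set of query."""
    literals = ", ".join(sql_literal(v) for v in pivot_values)
    check = f"SELECT {literals} INTERSECT {query}"
    return con.execute(check).fetchone() is not None

con = sqlite3.connect(":memory:")
con.executescript("""
    CREATE TABLE t0(c0);
    INSERT INTO t0 (c0) VALUES (0), (1), (2), (NULL);
""")
query = "SELECT c0 FROM t0 WHERE t0.c0 IS NOT 1"
print(pivot_row_contained(con, query, [None]))  # True
print(pivot_row_contained(con, query, [1]))     # False
```

An empty intersection for a pivot row that the rectified predicate should select is the signal that PQS reports as a bug.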

Slide 49

Slide 49 text

49 Implementation https://github.com/sqlancer

Slide 50

Slide 50 text

50 Bugs Overview
DBMS        Fixed  Verified
SQLite         64         0
MySQL          17         7
PostgreSQL      5         3

Slide 51

Slide 51 text

51 Bugs Overview
DBMS        Fixed  Verified
SQLite         64         0
MySQL          17         7
PostgreSQL      5         3
96 bugs were unique and previously unknown

Slide 52

Slide 52 text

52 Oracles
DBMS        Logic  Error  Crash
SQLite         46     17      2
MySQL          14     10      1
PostgreSQL      1      7      1
61 were logic bugs

Slide 53

Slide 53 text

53 Discussion: Limitations
• Implementation effort for complex operations
• Requires understanding of the SQL semantics
• Aggregate and window functions
• Ordering
• Duplicate rows

Slide 54

Slide 54 text

54 Discussion: Bug Importance https://www.mail-archive.com/[email protected]/msg117440.html CREATE TABLE t0 (c0); CREATE TABLE t1 (c1); INSERT INTO t0 VALUES (1); SELECT c0 FROM t0 LEFT JOIN t1 ON c1=c0 WHERE NOT (c1 IS NOT NULL AND c1=2);

Slide 55

Slide 55 text

55 Discussion: Bug Importance This is a cut-down example, right ? You can't possibly mean to do that WHERE clause in production code. https://www.mail-archive.com/[email protected]/msg117440.html CREATE TABLE t0 (c0); CREATE TABLE t1 (c1); INSERT INTO t0 VALUES (1); SELECT c0 FROM t0 LEFT JOIN t1 ON c1=c0 WHERE NOT (c1 IS NOT NULL AND c1=2);

Slide 56

Slide 56 text

56 Discussion: Bug Importance I might not spell it like that myself, but a code generator would do it (and much worse!). This example was simplified from a query generated by a Django ORM queryset using .exclude(nullable_joined_table__column=1), for instance. This is a cut-down example, right ? You can't possibly mean to do that WHERE clause in production code. https://www.mail-archive.com/[email protected]/msg117440.html CREATE TABLE t0 (c0); CREATE TABLE t1 (c1); INSERT INTO t0 VALUES (1); SELECT c0 FROM t0 LEFT JOIN t1 ON c1=c0 WHERE NOT (c1 IS NOT NULL AND c1=2);

Slide 57

Slide 57 text

57 Discussion: Bug Importance I might not spell it like that myself, but a code generator would do it (and much worse!). This example was simplified from a query generated by a Django ORM queryset using .exclude(nullable_joined_table__column=1), for instance. This is a cut-down example, right ? You can't possibly mean to do that WHERE clause in production code. https://www.mail-archive.com/[email protected]/msg117440.html Even “obscure” bugs might affect users CREATE TABLE t0 (c0); CREATE TABLE t1 (c1); INSERT INTO t0 VALUES (1); SELECT c0 FROM t0 LEFT JOIN t1 ON c1=c0 WHERE NOT (c1 IS NOT NULL AND c1=2);
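Why the query above must return the row with c0 = 1 can be worked through step by step. The sketch below models SQL's three-valued logic with hypothetical helpers `and3` and `not3` (with `None` standing in for NULL) and then runs the same statements on the SQLite build bundled with Python, which should agree as long as the build includes the fix.

```python
import sqlite3

def and3(a, b):
    """Three-valued AND: FALSE dominates, otherwise NULL propagates."""
    if a is False or b is False:
        return False
    if a is None or b is None:
        return None
    return True

def not3(a):
    """Three-valued NOT: NOT NULL is NULL."""
    return None if a is None else not a

# t1 is empty, so the LEFT JOIN pads c1 with NULL for the row c0 = 1.
c1 = None
is_not_null = c1 is not None                   # c1 IS NOT NULL -> FALSE
equals_2 = None if c1 is None else (c1 == 2)   # c1 = 2         -> NULL
where = not3(and3(is_not_null, equals_2))      # NOT (FALSE AND NULL) -> TRUE

con = sqlite3.connect(":memory:")
con.executescript("""
    CREATE TABLE t0 (c0);
    CREATE TABLE t1 (c1);
    INSERT INTO t0 VALUES (1);
""")
rows = con.execute(
    "SELECT c0 FROM t0 LEFT JOIN t1 ON c1=c0 "
    "WHERE NOT (c1 IS NOT NULL AND c1=2)"
).fetchall()
print("model says row is fetched:", where, "| DBMS returned:", rows)
```

Since the WHERE clause is TRUE for the padded row, an empty result set for this query is a logic bug rather than a matter of interpretation.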

Slide 58

Slide 58 text

58 @RiggerManuel [email protected] Summary Goal: Detect logic bugs PQS randomly selects a pivot row Rectify a random expression Evaluation: Close to 100 bugs in DBMSs