Finding Bugs in Database Systems
via Query Partitioning
Manuel Rigger Zhendong Su
ETH Zurich, Switzerland
OOPSLA 2020
@RiggerManuel @ast_eth https://people.inf.ethz.ch/suz/
Slide 2
Slide 2 text
2
Database Management Systems (DBMSs)
PostgreSQL
Slide 3
Slide 3 text
3
PostgreSQL
Database Management Systems (DBMSs)
MySQL is the most popular
open-source DBMS
Slide 4
Slide 4 text
4
Database Management Systems (DBMSs)
PostgreSQL
TiDB and CockroachDB are highly-popular,
commercially developed NewSQL DBMSs
Slide 5
Slide 5 text
5
PostgreSQL
Database Management Systems (DBMSs)
DuckDB is an emerging embedded
DBMS for analytical query workloads
Slide 6
Slide 6 text
6
Database Management Systems (DBMSs)
PostgreSQL
We found 193 bugs in these systems,
142 of which have been fixed
Slide 7
Slide 7 text
7
Goal: Find Logic Bugs
Logic bugs: DBMS returns an
incorrect result set
Slide 8
Slide 8 text
8
Motivating Example
c0
0
c0
-0
t0 t1
SELECT * FROM t0, t1
WHERE t0.c0 = t1.c0;
?
Slide 9
Slide 9 text
9
Motivating Example
c0
0
c0
-0
t0 t1
SELECT * FROM t0, t1
WHERE t0.c0 = t1.c0;
t0.c0 t0.c1
0 -0
The DBMS is expected to fetch
the row since 0 = -0 should
evaluate to TRUE
✓
Slide 10
Slide 10 text
10
Motivating Example
c0
0
c0
-0
t0 t1
SELECT * FROM t0, t1
WHERE t0.c0 = t1.c0; t0.c0 t0.c1
The latest version of MySQL that
we tested failed to fetch the row
https://bugs.mysql.com/bug.php?id=99122
Slide 11
Slide 11 text
11
Motivating Example
c0
0
c0
-0
t0 t1
SELECT * FROM t0, t1
WHERE t0.c0 = t1.c0; t0.c0 t0.c1
Challenge: Finding a test
oracle that can automatically
validate the result set
Incorrect result!
https://bugs.mysql.com/bug.php?id=99122
15
Background: Differential Testing
DBMS-
specific SQL
Common
SQL Core
Slide 16
Slide 16 text
16
Background: Differential Testing
DBMS-
specific SQL
Common
SQL Core
“We are unable to use Postgres as an
oracle because CockroachDB has slightly
different semantics and SQL support, and
generating queries that execute identically
on both is tricky […].” – Cockroach Labs
Slide 17
Slide 17 text
17
Core Idea
Novel test oracle: Partition a query into
multiple partitioning queries, from which the
original query’s result set can be derived
Slide 18
Slide 18 text
18
Query Partitioning
Query
Generator
Q
RS(Q)
Slide 19
Slide 19 text
19
Query Partitioning
Query
Generator
Q
RS(Q)
Q denotes the (randomly-generated) original query
Slide 20
Slide 20 text
20
Query Partitioning
Query
Generator
Q
RS(Q)
RS(Q) denotes the original query’s result set
Slide 21
Slide 21 text
21
Query Partitioning
Query
Generator
Q
RS(Q)
RS(Q’1
)
RS(Q’2
)
RS(Q’3
)
RS(Q’n
)
Q’1
Q’2
Q’3
Q’n
Partition the result set
29
How to Realize This Idea?
Key challenge: find a valid partitioning
strategy that stresses the DBMS
Slide 30
Slide 30 text
30
Ternary Logic
Consider a predicate φ and a given row r.
Exactly one of the following must hold:
• φ
• NOT φ
• φ IS NULL
Slide 31
Slide 31 text
31
Ternary Logic
Consider a predicate φ and a given row r.
Exactly one of the following must hold:
• φ
• NOT φ
• φ IS NULL
Slide 32
Slide 32 text
32
Ternary Logic
Consider a predicate φ and a given row r.
Exactly one of the following must hold:
• φ
• NOT φ
• φ IS NULL
φ
NOT φ
φ IS NULL
Slide 33
Slide 33 text
33
Ternary Logic
Consider a predicate φ and a given row r.
Exactly one of the following must hold:
• φ
• NOT φ
• φ IS NULL
ternary predicate variants
Slide 34
Slide 34 text
34
Motivating Example
c0
0
c0
-0
t0 t1
SELECT * FROM t0, t1
WHERE t0.c0 = t1.c0; t0.c0 t0.c1
https://bugs.mysql.com/bug.php?id=99122
How did this insight allow
us to detect this bug?
Slide 35
Slide 35 text
35
Example: MySQL
SELECT * FROM t0, t1;
Slide 36
Slide 36 text
36
Example: MySQL
t0.c0 t0.c1
0 -0
SELECT * FROM t0, t1;
Slide 37
Slide 37 text
37
Example: MySQL
t0.c0 t0.c1
0 -0
SELECT * FROM t0, t1 WHERE t0.c0=t1.c0
UNION ALL
SELECT * FROM t0, t1 WHERE NOT (t0.c0=t1.c0)
UNION ALL
SELECT * FROM t0, t1 WHERE (t0.c0=t1.c0) IS NULL;
SELECT * FROM t0, t1;
Slide 38
Slide 38 text
38
Example: MySQL
t0.c0 t0.c1
0 -0
SELECT * FROM t0, t1 WHERE t0.c0=t1.c0
UNION ALL
SELECT * FROM t0, t1 WHERE NOT (t0.c0=t1.c0)
UNION ALL
SELECT * FROM t0, t1 WHERE (t0.c0=t1.c0) IS NULL;
SELECT * FROM t0, t1;
φ
Slide 39
Slide 39 text
39
Example: MySQL
t0.c0 t0.c1
0 -0
SELECT * FROM t0, t1 WHERE t0.c0=t1.c0
UNION ALL
SELECT * FROM t0, t1 WHERE NOT (t0.c0=t1.c0)
UNION ALL
SELECT * FROM t0, t1 WHERE (t0.c0=t1.c0) IS NULL;
SELECT * FROM t0, t1;
t0.c0 t0.c1
φ
Slide 40
Slide 40 text
40
Example: MySQL
t0.c0 t0.c1
0 -0
SELECT * FROM t0, t1 WHERE t0.c0=t1.c0
UNION ALL
SELECT * FROM t0, t1 WHERE NOT (t0.c0=t1.c0)
UNION ALL
SELECT * FROM t0, t1 WHERE (t0.c0=t1.c0) IS NULL;
SELECT * FROM t0, t1;
t0.c0 t0.c1
≠
φ
42
Ternary Logic Partitioning (TLP)
Query
Generator
Q
RS(Q) RS(Q’)
Q’p
Q’p IS NULL
♢
Q’
Q ¬p
RS(Q’p
)
RS(Q ¬p
)
RS(Q’p IS NULL
)
SELECT * FROM t0, t1 WHERE t0.c0=t1.c0
Slide 43
Slide 43 text
43
Ternary Logic Partitioning (TLP)
Query
Generator
Q
RS(Q) RS(Q’)
Q’p
Q’p IS NULL
♢
Q’
Q ¬p
RS(Q’p
)
RS(Q ¬p
)
RS(Q’p IS NULL
)
SELECT * FROM t0, t1 WHERE NOT (t0.c0=t1.c0)
Slide 44
Slide 44 text
44
Ternary Logic Partitioning (TLP)
Query
Generator
Q
RS(Q) RS(Q’)
Q’p
Q’p IS NULL
♢
Q’
Q ¬p
RS(Q’p
)
RS(Q ¬p
)
RS(Q’p IS NULL
)
SELECT * FROM t0, t1 WHERE (t0.c0=t1.c0) IS NULL;
Slide 45
Slide 45 text
45
Ternary Logic Partitioning (TLP)
Query
Generator
Q
RS(Q) RS(Q’)
Q’p
Q’p IS NULL
♢
Q’
Q ¬p
RS(Q’p
)
RS(Q ¬p
)
RS(Q’p IS NULL
)
UNION ALL
Slide 46
Slide 46 text
46
Scope
• WHERE
• GROUP BY
• HAVING
• DISTINCT queries
• Aggregate functions
Slide 47
Slide 47 text
47
Testing WHERE Clauses
Q Q’ptern
♢(Q’p
, Q’¬p
, Q’p IS NULL
)
SELECT
FROM
[]
SELECT
FROM
[]
WHERE ptern
Q′p
⊎ Q′¬p
⊎ Q′p IS NULL
Slide 48
Slide 48 text
48
Testing WHERE Clauses
Q Q’ptern
♢(Q’p
, Q’¬p
, Q’p IS NULL
)
SELECT
FROM
[]
SELECT
FROM
[]
WHERE ptern
Q′p
⊎ Q′¬p
⊎ Q′p IS NULL
The multiset addition can be
implemented using UNION ALL
Slide 49
Slide 49 text
49
Scope
• WHERE
• GROUP BY
• HAVING
• DISTINCT queries
• Aggregate functions
Slide 50
Slide 50 text
50
Testing Self-decomposable Aggregate Functions
Q Q’ptern
♢(Q’p
, Q’¬p
, Q’p IS NULL
)
SELECT MAX()
FROM
[]
SELECT MAX()
FROM
[]
WHERE ptern;
MAX(Q′p
⊎ Q′¬p
⊎ Q′p IS NULL
)
Slide 51
Slide 51 text
51
Testing Self-decomposable Aggregate Functions
Q Q’ptern
♢(Q’p
, Q’¬p
, Q’p IS NULL
)
SELECT MAX()
FROM
[]
SELECT MAX()
FROM
[]
WHERE ptern;
MAX(Q′p
⊎ Q′¬p
⊎ Q′p IS NULL
)
A partition is an intermediate
result, rather than
a subset of the result set
Slide 52
Slide 52 text
52
Bug Example: CockroachDB
SET vectorize=experimental_on;
CREATE TABLE t0(c0 INT);
CREATE TABLE t1(c0 BOOL) INTERLEAVE IN PARENT t0(rowid);
INSERT INTO t0(c0) VALUES (0);
INSERT INTO t1(rowid, c0) VALUES(0, TRUE);
Slide 53
Slide 53 text
53
Bug Example: CockroachDB
SET vectorize=experimental_on;
CREATE TABLE t0(c0 INT);
CREATE TABLE t1(c0 BOOL) INTERLEAVE IN PARENT t0(rowid);
INSERT INTO t0(c0) VALUES (0);
INSERT INTO t1(rowid, c0) VALUES(0, TRUE);
NULL
SELECT MAX(t1.rowid)
FROM t1;
Slide 54
Slide 54 text
54
Bug Example: CockroachDB
SET vectorize=experimental_on;
CREATE TABLE t0(c0 INT);
CREATE TABLE t1(c0 BOOL) INTERLEAVE IN PARENT t0(rowid);
INSERT INTO t0(c0) VALUES (0);
INSERT INTO t1(rowid, c0) VALUES(0, TRUE);
NULL 0
SELECT MAX(aggr) FROM (
SELECT MAX(t1.rowid) as aggr FROM t1 WHERE '+' >= t1.c0 UNION ALL
SELECT MAX(t1.rowid) as aggr FROM t1 WHERE NOT('+' >= t1.c0) UNION ALL
SELECT MAX(t1.rowid) as aggr FROM t1 WHERE ('+' >= t1.c0) IS NULL
);
SELECT MAX(t1.rowid)
FROM t1;
≠
Slide 55
Slide 55 text
55
Testing Decomposable Aggregate Functions
Q Q’ptern
♢(Q’p
, Q’¬p
, Q’p IS NULL
)
SELECT AVG()
FROM
[];
SELECT SUM() as s,
COUNT() as s
FROM
[];
SUM(s(Q′p
⊎ Q′¬p ⊎
Q′p IS NULL
))
SUM(c(Q′p
⊎ Q′¬p ⊎
Q′p IS NULL
))
Slide 56
Slide 56 text
56
Implementation
https://github.com/sqlancer
Slide 57
Slide 57 text
57
SQLancer
Generate a
Database
Generate a
Query
Validate the
Query’s Result
Slide 58
Slide 58 text
58
SQLancer
Generate a
Database
Generate a
Query
Validate the
Query’s Result
Heuristic database and query
generators for DuckDB, H2 and TiDB
Slide 59
Slide 59 text
59
SQLancer
Generate a
Database
Generate a
Query
Validate the
Query’s Result
TLP test oracles
Slide 60
Slide 60 text
60
Evaluation: Found Bugs
Closed
DBMS Fixed Verified Intended Duplicate
SQLite 4 0 0 0
MySQL 1 6 3 0
H2 16 2 0 1
CockroachDB 23 8 0 0
TiDB 26 35 0 1
DuckDB 72 0 0 2
We found 193 unique, previously unknown
bugs, 142 of which have been fixed!
62
Query Partitioning Oracle
DBMS WHERE Aggregate GROUP BY HAVING DISTINCT Error Crash
SQLite 0 3 0 0 1 0 0
CockroachDB 3 3 0 1 0 22 2
TiDB 29 0 1 0 0 27 4
MySQL 7 0 0 0 0 0 0
DuckDB 21 4 1 2 1 13 19
H2 2 0 0 0 0 16 0
Evaluation: Found Bugs
The WHERE oracle is the simplest, but most effective oracle
Slide 63
Slide 63 text
63
Query Partitioning Oracle
DBMS WHERE Aggregate GROUP BY HAVING DISTINCT Error Crash
SQLite 0 3 0 0 1 0 0
CockroachDB 3 3 0 1 0 22 2
TiDB 29 0 1 0 0 27 4
MySQL 7 0 0 0 0 0 0
DuckDB 21 4 1 2 1 13 19
H2 2 0 0 0 0 16 0
Evaluation: Found Bugs
The other oracles found interesting, but fewer bugs
Slide 64
Slide 64 text
64
Query Partitioning Oracle
DBMS WHERE Aggregate GROUP BY HAVING DISTINCT Error Crash
SQLite 0 3 0 0 1 0 0
CockroachDB 3 3 0 1 0 22 2
TiDB 29 0 1 0 0 27 4
MySQL 7 0 0 0 0 0 0
DuckDB 21 4 1 2 1 13 19
H2 2 0 0 0 0 16 0
Evaluation: Found Bugs
We tested them only for DBMSs for which the TLP
WHERE oracle could not find additional bugs
Slide 65
Slide 65 text
65
Developer Reception: DuckDB
This work is tremendously helpful for us, and I imagine anyone
working on a DBMS. Usually these bugs would be slowly found by
users over the years, not only negatively affecting the experience of
those users but also requiring much more effort to debug and
reproduce […].
Slide 66
Slide 66 text
66
Adoption
Slide 67
Slide 67 text
67
Adoption
DuckDB runs SQLancer + TLP on every pull request
Slide 68
Slide 68 text
68
Adoption
Ilya Yatsishin (Yandex)
contributed a TLP
implementation to SQLancer
Slide 69
Slide 69 text
69
Adoption
https://www.monetdb.org/blog/faster-robuster-and-feature-richer-monetdb-in-2020-and-beyond
“With the help of SQLancer, an automatic DBMS testing
tool, we have been able to identify >100 potential
problems in corner cases of the SQL processor.”
Slide 70
Slide 70 text
70
Adoption
PingCAP implemented
TLP as a tool go-sqlancer
https://github.com/chaos-mesh/go-sqlancer
Slide 71
Slide 71 text
71
Adoption
Mining for logic bugs in the Citus extension to Postgres with
SQLancer by Nazli Ugur Koyluoglu
https://www.citusdata.com/blog/2020/09/04/mining-for-logic-bugs-in-citus-with-sqlancer/
Slide 72
Slide 72 text
72
@RiggerManuel [email protected]
Summary
Goal: detect logic bugs Query Partitioning as a general concept
TLP as a concrete partition strategy Evaluation: Close to 200 bugs in DBMSs