Slide 1

Slide 1 text

XML, HSTORE, JSON, JSONB—OH MY! David E. Wheeler @theory iovation ! PGCon 2014-05-23

Slide 2

Slide 2 text

Unstructured Data Types Several supported XML, HSTORE, JSON, JSONB Which should use use? When should you use it? Why? A brief tour

Slide 3

Slide 3 text

“It has been said that XML is like violence; if a little doesn’t solve the problem, use more.” — Chris Maden

Slide 4

Slide 4 text

XML

Slide 5

Slide 5 text

XML Added in 8.2

Slide 6

Slide 6 text

XML Added in 8.2 Data Type

Slide 7

Slide 7 text

XML Added in 8.2 Data Type Publishing

Slide 8

Slide 8 text

XML Added in 8.2 Data Type Publishing Export

Slide 9

Slide 9 text

XML Added in 8.2 Data Type Publishing Export SQL:2003 Conformance

Slide 10

Slide 10 text

XML Added in 8.2 Data Type Publishing Export SQL:2003 Conformance XPath

Slide 11

Slide 11 text

XML Implementation

Slide 12

Slide 12 text

XML Implementation Input validation

Slide 13

Slide 13 text

XML Implementation Input validation Text storage

Slide 14

Slide 14 text

XML Implementation Input validation Text storage No comparison operators

Slide 15

Slide 15 text

XML Implementation Input validation Text storage No comparison operators So no indexing

Slide 16

Slide 16 text

XML Implementation Input validation Text storage No comparison operators So no indexing Can cast to text

Slide 17

Slide 17 text

XML Implementation Input validation Text storage No comparison operators So no indexing Can cast to text Better: Use XPath

Slide 18

Slide 18 text

XML Generation

Slide 19

Slide 19 text

XML Generation xmlelement()

Slide 20

Slide 20 text

XML Generation xmlelement() xmlattributes()

Slide 21

Slide 21 text

XML Generation xmlelement() xmlattributes() xmlcomment() xmlconcat()

Slide 22

Slide 22 text

XML Generation xmlelement() xmlattributes() xmlcomment() xmlconcat() % SELECT xmlelement(name foo, xmlattributes('xyz' as bar), xmlelement(name abc), xmlcomment('ow'), xmlelement(name xyz, xmlelement(name yack, 'barber') ) ); xmlelement -------------------------------------------------------------------- barber

Slide 23

Slide 23 text

XML Predicates

Slide 24

Slide 24 text

XML Predicates IS DOCUMENT

Slide 25

Slide 25 text

XML Predicates IS DOCUMENT xml_is_well_formed()

Slide 26

Slide 26 text

XML Predicates IS DOCUMENT xml_is_well_formed() XMLEXISTS()

Slide 27

Slide 27 text

XML Predicates IS DOCUMENT xml_is_well_formed() XMLEXISTS() % SELECT xmlexists( $$//town[text() = 'Ottawa']$$ PASSING 'PortlandOttawa' ); xmlexists ----------- t XPath!

Slide 28

Slide 28 text

XPath

Slide 29

Slide 29 text

XPath % SELECT xpath( '/p/a/text()', '

testme

' ); xpath ----------- {test,me} %

Slide 30

Slide 30 text

XPath % SELECT xpath( '/p/a/text()', '

testme

' ); xpath ----------- {test,me} % SELECT xpath_exists( '/p/a/text()', '

testme

' ); xpath_exists -------------- t

Slide 31

Slide 31 text

XPath % SELECT xpath( '/p/a/text()', '

testme

' ); xpath ----------- {test,me} % SELECT xpath_exists( '/p/a/text()', '

testme

' ); xpath_exists -------------- t Supports namespacing.

Slide 32

Slide 32 text

XML Table Mapping

Slide 33

Slide 33 text

XML Table Mapping % SELECT query_to_xml($$ SELECT n.nspname, c.relname FROM pg_class c JOIN pg_namespace n ON c.relnamespace = n.oid WHERE n.nspname NOT IN ('pg_catalog', 'pg_toast', 'information_schema') $$, true, false, ''); query_to_xml --------------------------------------------------------------- + + public + stuff + + + public + stuff_pkey + + + public + idx_stuff_somekey + + +

Slide 34

Slide 34 text

Querying

Slide 35

Slide 35 text

Querying % \set xpath '\'/us-patent-grant/us-bibliographic-data-grant/examiners/*/last-name/text()\''

Slide 36

Slide 36 text

Querying % \set xpath '\'/us-patent-grant/us-bibliographic-data-grant/examiners/*/last-name/text()\'' % SELECT COUNT(*) FROM grants WHERE '{Brooks,Mroczka}' = xpath(:xpath, xmldoc)::text[]; count ------- 4 (1 row) Time: 37311.163 ms

Slide 37

Slide 37 text

Querying % \set xpath '\'/us-patent-grant/us-bibliographic-data-grant/examiners/*/last-name/text()\'' % SELECT COUNT(*) FROM grants WHERE '{Brooks,Mroczka}' = xpath(:xpath, xmldoc)::text[]; count ------- 4 (1 row) Time: 37311.163 ms A bit slow.

Slide 38

Slide 38 text

Index Query

Slide 39

Slide 39 text

Index Query % CREATE INDEX idx_grant_examiners ON grants(CAST(xpath(:xpath, xmldoc) AS text[])); CREATE INDEX %

Slide 40

Slide 40 text

Index Query % CREATE INDEX idx_grant_examiners ON grants(CAST(xpath(:xpath, xmldoc) AS text[])); CREATE INDEX % SELECT COUNT(*) FROM grants WHERE '{Brooks,Mroczka}' = xpath(:xpath, xmldoc)::text[]; count ------- 4 (1 row) Time: 0.716 ms Nice.

Slide 41

Slide 41 text

Query Index Values

Slide 42

Slide 42 text

Query Index Values % SELECT COUNT(*) FROM grants WHERE 'Brooks' = ANY(xpath(:xpath, xmldoc)::text[]); count ------- 34 (1 row) Time: 39348.995 ms Table scan

Slide 43

Slide 43 text

Query Index Values % SELECT COUNT(*) FROM grants WHERE 'Brooks' = ANY(xpath(:xpath, xmldoc)::text[]); count ------- 34 (1 row) Time: 39348.995 ms Yikes!

Slide 44

Slide 44 text

Querying

Slide 45

Slide 45 text

Querying % \set xpath '\'/us-patent-grant/us-bibliographic-data-grant/examiners/primary-examiner/last-name/text()\''

Slide 46

Slide 46 text

Querying % \set xpath '\'/us-patent-grant/us-bibliographic-data-grant/examiners/primary-examiner/last-name/text()\'' % SELECT COUNT(*) FROM grants WHERE CAST((xpath(:xpath, xmldoc))[1] AS text) = 'Brooks'; count ------- 12 (1 row) Time: 37187.900 ms

Slide 47

Slide 47 text

Index Query

Slide 48

Slide 48 text

Index Query % CREATE INDEX idx_grant_primary_examiner ON grants(CAST((xpath(:xpath, xmldoc))[1] AS text)); CREATE INDEX %

Slide 49

Slide 49 text

Index Query % CREATE INDEX idx_grant_primary_examiner ON grants(CAST((xpath(:xpath, xmldoc))[1] AS text)); CREATE INDEX % SELECT count(*) FROM grants WHERE CAST((xpath(:xpath, xmldoc))[1] AS text) = 'Brooks'; count ------- 12 (1 row) Time: 1.046 ms

Slide 50

Slide 50 text

Index Query % CREATE INDEX idx_grant_primary_examiner ON grants(CAST((xpath(:xpath, xmldoc))[1] AS text)); CREATE INDEX % SELECT count(*) FROM grants WHERE CAST((xpath(:xpath, xmldoc))[1] AS text) = 'Brooks'; count ------- 12 (1 row) Time: 1.046 ms Scalar

Slide 51

Slide 51 text

Index Query % CREATE INDEX idx_grant_primary_examiner ON grants(CAST((xpath(:xpath, xmldoc))[1] AS text)); CREATE INDEX % SELECT count(*) FROM grants WHERE CAST((xpath(:xpath, xmldoc))[1] AS text) = 'Brooks'; count ------- 12 (1 row) Time: 1.046 ms Mo betta.

Slide 52

Slide 52 text

When to use XML

Slide 53

Slide 53 text

When to use XML When XML required:

Slide 54

Slide 54 text

When to use XML When XML required: Existing documents

Slide 55

Slide 55 text

When to use XML When XML required: Existing documents SOAP API

Slide 56

Slide 56 text

When to use XML When XML required: Existing documents SOAP API XHTML

Slide 57

Slide 57 text

When to use XML When XML required: Existing documents SOAP API XHTML Good document storage

Slide 58

Slide 58 text

When to use XML When XML required: Existing documents SOAP API XHTML Good document storage Stored as text

Slide 59

Slide 59 text

When to use XML When XML required: Existing documents SOAP API XHTML Good document storage Stored as text Fast I/O

Slide 60

Slide 60 text

When to use XML When XML required: Existing documents SOAP API XHTML Good document storage Stored as text Fast I/O XPath is Awesome

Slide 61

Slide 61 text

When to use XML When XML required: Existing documents SOAP API XHTML Good document storage Stored as text Fast I/O XPath is Awesome Best on indexed scalars

Slide 62

Slide 62 text

When to use XML When XML required: Existing documents SOAP API XHTML Good document storage Stored as text Fast I/O XPath is Awesome Best on indexed scalars When table scans okay

Slide 63

Slide 63 text

When to use XML When XML required: Existing documents SOAP API XHTML Good document storage Stored as text Fast I/O XPath is Awesome Best on indexed scalars When table scans okay Encumbers per-row parsing overhead

Slide 64

Slide 64 text

—Theory “HSTORE: Like Perl hashes, but flatter, less typed, and incompatible.”

Slide 65

Slide 65 text

HSTORE

Slide 66

Slide 66 text

HSTORE Also added in 8.2

Slide 67

Slide 67 text

HSTORE Also added in 8.2 Simple key/value pairs

Slide 68

Slide 68 text

HSTORE Also added in 8.2 Simple key/value pairs Strings only

Slide 69

Slide 69 text

HSTORE Also added in 8.2 Simple key/value pairs Strings only No nested values

Slide 70

Slide 70 text

HSTORE Also added in 8.2 Simple key/value pairs Strings only No nested values Lots of useful operators

Slide 71

Slide 71 text

HSTORE Also added in 8.2 Simple key/value pairs Strings only No nested values Lots of useful operators GiST and GIN indexing

Slide 72

Slide 72 text

HSTORE Improvements

Slide 73

Slide 73 text

HSTORE Improvements Improved in 9.0:

Slide 74

Slide 74 text

HSTORE Improvements Improved in 9.0: Many new operators

Slide 75

Slide 75 text

HSTORE Improvements Improved in 9.0: Many new operators BTree and Hash indexing

Slide 76

Slide 76 text

HSTORE Improvements Improved in 9.0: Many new operators BTree and Hash indexing Increased capacity

Slide 77

Slide 77 text

HSTORE Syntax

Slide 78

Slide 78 text

HSTORE Syntax % CREATE EXTENSION hstore; CREATE EXTENSION %

Slide 79

Slide 79 text

HSTORE Syntax % CREATE EXTENSION hstore; CREATE EXTENSION % SELECT 'a => 1, a => 2'::hstore; hstore ---------- "a"=>"1" %

Slide 80

Slide 80 text

HSTORE Syntax % CREATE EXTENSION hstore; CREATE EXTENSION % SELECT 'a => 1, a => 2'::hstore; hstore ---------- "a"=>"1" % SELECT '"hey there" => NULL, "salad" => "super"'::hstore; hstore ------------------------------------- "salad"=>"super", "hey there"=>NULL %

Slide 81

Slide 81 text

HSTORE Operators

Slide 82

Slide 82 text

HSTORE Operators % SELECT 'user => "fred", id => 1'::hstore -> 'user' AS user; user ------ fred % Value for key.

Slide 83

Slide 83 text

HSTORE Operators % SELECT 'user => "fred", id => 1'::hstore -> 'user' AS user; user ------ fred % SELECT 'user => "fred", id => 1'::hstore -> ARRAY['user','id'] AS vals; vals ---------- {fred,1} % Value for keys.

Slide 84

Slide 84 text

HSTORE Operators % SELECT 'user => "fred", id => 1'::hstore -> 'user' AS user; user ------ fred % SELECT 'user => "fred", id => 1'::hstore -> ARRAY['user','id'] AS vals; vals ---------- {fred,1} % SELECT 'user => "fred", id => 1'::hstore @> 'id=>1' AS one; one ----- t Does left contain right?

Slide 85

Slide 85 text

More HSTORE Operators

Slide 86

Slide 86 text

More HSTORE Operators || Concatenation

Slide 87

Slide 87 text

More HSTORE Operators || Concatenation ? Key exists

Slide 88

Slide 88 text

More HSTORE Operators || Concatenation ? Key exists ?& Keys exist

Slide 89

Slide 89 text

More HSTORE Operators || Concatenation ? Key exists ?& Keys exist ?| Do any keys exist

Slide 90

Slide 90 text

More HSTORE Operators || Concatenation ? Key exists ?& Keys exist ?| Do any keys exist - Delete key or keys

Slide 91

Slide 91 text

More HSTORE Operators || Concatenation ? Key exists ?& Keys exist ?| Do any keys exist - Delete key or keys %% Convert to array

Slide 92

Slide 92 text

HSTORE Functions

Slide 93

Slide 93 text

% SELECT hstore('a', 'b'); hstore ---------- "a"=>"b" % HSTORE Functions HSTORE constructor.

Slide 94

Slide 94 text

% SELECT hstore('a', 'b'); hstore ---------- "a"=>"b" % HSTORE Functions SELECT hstore(ROW(1, 2)); hstore ---------------------- "f1"=>"1", "f2"=>"2" % Convert row to HSTORE.

Slide 95

Slide 95 text

% SELECT hstore('a', 'b'); hstore ---------- "a"=>"b" % HSTORE Functions SELECT hstore(ROW(1, 2)); hstore ---------------------- "f1"=>"1", "f2"=>"2" % SELECT hstore(ARRAY['a','1','b','2']); hstore -------------------- "a"=>"1", "b"=>"2" % Convert array to HSTORE.

Slide 96

Slide 96 text

HSTORE Functions

Slide 97

Slide 97 text

% SELECT akeys('a => 1, b => 2'); akeys ------- {a,b} % HSTORE Functions

Slide 98

Slide 98 text

% SELECT akeys('a => 1, b => 2'); akeys ------- {a,b} % HSTORE Functions SELECT avals('a => 1, b => 2'); avals ------- {1,2} %

Slide 99

Slide 99 text

% SELECT akeys('a => 1, b => 2'); akeys ------- {a,b} % HSTORE Functions SELECT avals('a => 1, b => 2'); avals ------- {1,2} % SELECT hstore_to_json('"a key" => 1, b => t, c => null'); hstore_to_json ------------------------------------- {"b": "t", "c": null, "a key": "1"} %

Slide 100

Slide 100 text

HSTORE Sets

Slide 101

Slide 101 text

HSTORE Sets % SELECT skeys('"a key" => 1, b => t'); skeys ------- b a key %

Slide 102

Slide 102 text

HSTORE Sets % SELECT skeys('"a key" => 1, b => t'); skeys ------- b a key % SELECT svals('"a key" => 1, b => t'); svals ------- t 1 %

Slide 103

Slide 103 text

HSTORE Sets % SELECT skeys('"a key" => 1, b => t'); skeys ------- b a key % SELECT svals('"a key" => 1, b => t'); svals ------- t 1 % SELECT * FROM each('"a key" => 1, b => t, c => null'); key | value -------+-------- b | t c | a key | 1

Slide 104

Slide 104 text

HSTORE Performance

Slide 105

Slide 105 text

HSTORE Performance Grabbed 1998 Amazon review data

Slide 106

Slide 106 text

HSTORE Performance Grabbed 1998 Amazon review data Thanks CitusDB!

Slide 107

Slide 107 text

HSTORE Performance Grabbed 1998 Amazon review data Thanks CitusDB! Converted from nested JSON

Slide 108

Slide 108 text

HSTORE Performance Grabbed 1998 Amazon review data Thanks CitusDB! Converted from nested JSON To flattened HSTORE

Slide 109

Slide 109 text

HSTORE Load

Slide 110

Slide 110 text

HSTORE Load > createdb hreviews > psql -d hreviews -c ' CREATE EXTENSION HSTORE; CREATE TABLE reviews(review hstore); ' CREATE TABLE >

Slide 111

Slide 111 text

HSTORE Load > createdb hreviews > psql -d hreviews -c ' CREATE EXTENSION HSTORE; CREATE TABLE reviews(review hstore); ' CREATE TABLE > > time psql hreviews -c "COPY reviews FROM 'reviews.hstore'" COPY 589859 0.00s user 0.00s system 0% cpu 8.595 total

Slide 112

Slide 112 text

HSTORE Load 68,628 records/second > createdb hreviews > psql -d hreviews -c ' CREATE EXTENSION HSTORE; CREATE TABLE reviews(review hstore); ' CREATE TABLE > > time psql hreviews -c "COPY reviews FROM 'reviews.hstore'" COPY 589859 0.00s user 0.00s system 0% cpu 8.595 total

Slide 113

Slide 113 text

HSTORE Load 68,628 records/second 233 MB COPY file > createdb hreviews > psql -d hreviews -c ' CREATE EXTENSION HSTORE; CREATE TABLE reviews(review hstore); ' CREATE TABLE > > time psql hreviews -c "COPY reviews FROM 'reviews.hstore'" COPY 589859 0.00s user 0.00s system 0% cpu 8.595 total

Slide 114

Slide 114 text

HSTORE Load 68,628 records/second 233 MB COPY file 256 MB Database > createdb hreviews > psql -d hreviews -c ' CREATE EXTENSION HSTORE; CREATE TABLE reviews(review hstore); ' CREATE TABLE > > time psql hreviews -c "COPY reviews FROM 'reviews.hstore'" COPY 589859 0.00s user 0.00s system 0% cpu 8.595 total

Slide 115

Slide 115 text

HSTORE Load 68,628 records/second 233 MB COPY file 256 MB Database 9.9% storage overhead > createdb hreviews > psql -d hreviews -c ' CREATE EXTENSION HSTORE; CREATE TABLE reviews(review hstore); ' CREATE TABLE > > time psql hreviews -c "COPY reviews FROM 'reviews.hstore'" COPY 589859 0.00s user 0.00s system 0% cpu 8.595 total

Slide 116

Slide 116 text

Bucket ’O Books

Slide 117

Slide 117 text

Bucket ’O Books % SELECT width_bucket(length(review->'product_title'), 1, 50, 5) title_bkt, round(avg((review->'review_rating')::numeric), 2) review_avg, COUNT(*) FROM reviews WHERE review->'product_group' = 'DVD' GROUP BY title_bkt ORDER BY title_bkt;

Slide 118

Slide 118 text

Bucket ’O Books % SELECT width_bucket(length(review->'product_title'), 1, 50, 5) title_bkt, round(avg((review->'review_rating')::numeric), 2) review_avg, COUNT(*) FROM reviews WHERE review->'product_group' = 'DVD' GROUP BY title_bkt ORDER BY title_bkt; title_bkt | review_avg | count -----------+------------+------- 1 | 4.27 | 2646 2 | 4.44 | 4180 3 | 4.53 | 1996 4 | 4.38 | 2294 5 | 4.48 | 943 6 | 4.42 | 738 (6 rows) Time: 207.665 ms

Slide 119

Slide 119 text

Bucket ’O Books % SELECT width_bucket(length(review->'product_title'), 1, 50, 5) title_bkt, round(avg((review->'review_rating')::numeric), 2) review_avg, COUNT(*) FROM reviews WHERE review->'product_group' = 'DVD' GROUP BY title_bkt ORDER BY title_bkt; title_bkt | review_avg | count -----------+------------+------- 1 | 4.27 | 2646 2 | 4.44 | 4180 3 | 4.53 | 1996 4 | 4.38 | 2294 5 | 4.48 | 943 6 | 4.42 | 738 (6 rows) Time: 207.665 ms Could be better.

Slide 120

Slide 120 text

HSTORE GIN Indexing

Slide 121

Slide 121 text

HSTORE GIN Indexing % CREATE INDEX idx_reviews_gin ON reviews USING gin(review); CREATE INDEX Time: 227250.133 ms % Get coffee.

Slide 122

Slide 122 text

HSTORE GIN Indexing % CREATE INDEX idx_reviews_gin ON reviews USING gin(review); CREATE INDEX Time: 227250.133 ms % % SELECT width_bucket(length(review->'product_title'), 1, 50, 5) title_bkt, round(avg((review->'review_rating')::numeric), 2) review_avg, COUNT(*) FROM reviews WHERE review @> '"product_group" => "DVD"' GROUP BY title_bkt ORDER BY title_bkt; Containment.

Slide 123

Slide 123 text

HSTORE GIN Indexing % CREATE INDEX idx_reviews_gin ON reviews USING gin(review); CREATE INDEX Time: 227250.133 ms % % SELECT width_bucket(length(review->'product_title'), 1, 50, 5) title_bkt, round(avg((review->'review_rating')::numeric), 2) review_avg, COUNT(*) FROM reviews WHERE review @> '"product_group" => "DVD"' GROUP BY title_bkt ORDER BY title_bkt; title_bkt | review_avg | count -----------+------------+------- 1 | 4.27 | 2646 2 | 4.44 | 4180 3 | 4.53 | 1996 4 | 4.38 | 2294 5 | 4.48 | 943 6 | 4.42 | 738 (6 rows) Time: 28.509 ms Better.

Slide 124

Slide 124 text

HSTORE Index Sizing

Slide 125

Slide 125 text

HSTORE Index Sizing Database now 301 MB

Slide 126

Slide 126 text

HSTORE Index Sizing Database now 301 MB 17.6% Overhead for GIN index

Slide 127

Slide 127 text

HSTORE Index Sizing Database now 301 MB 17.6% Overhead for GIN index Use expression index for scalar values

Slide 128

Slide 128 text

HSTORE Index Sizing Database now 301 MB 17.6% Overhead for GIN index Use expression index for scalar values % DROP INDEX idx_reviews_gin; DROP INDEX; % CREATE INDEX idx_dvd_reviews ON reviews ((review -> 'product_group')); CREATE INDEX Time: 8081.842 ms No coffee.

Slide 129

Slide 129 text

HSTORE Index Sizing

Slide 130

Slide 130 text

HSTORE Index Sizing Database now 268 MB

Slide 131

Slide 131 text

HSTORE Index Sizing Database now 268 MB 5% Overhead for expression index

Slide 132

Slide 132 text

HSTORE Index Sizing Database now 268 MB 5% Overhead for expression index And performance?

Slide 133

Slide 133 text

Back to the Books

Slide 134

Slide 134 text

Back to the Books % SELECT width_bucket(length(review->'product_title'), 1, 50, 5) title_bkt, round(avg((review->'review_rating')::numeric), 2) review_avg, COUNT(*) FROM reviews WHERE review->'product_group' = 'DVD' GROUP BY title_bkt ORDER BY title_bkt; title_bkt | review_avg | count -----------+------------+------- 1 | 4.27 | 2646 2 | 4.44 | 4180 3 | 4.53 | 1996 4 | 4.38 | 2294 5 | 4.48 | 943 6 | 4.42 | 738 (6 rows) Time: 20.547 ms Fetch scalar again.

Slide 135

Slide 135 text

Back to the Books % SELECT width_bucket(length(review->'product_title'), 1, 50, 5) title_bkt, round(avg((review->'review_rating')::numeric), 2) review_avg, COUNT(*) FROM reviews WHERE review->'product_group' = 'DVD' GROUP BY title_bkt ORDER BY title_bkt; title_bkt | review_avg | count -----------+------------+------- 1 | 4.27 | 2646 2 | 4.44 | 4180 3 | 4.53 | 1996 4 | 4.38 | 2294 5 | 4.48 | 943 6 | 4.42 | 738 (6 rows) Time: 20.547 ms Very nice.

Slide 136

Slide 136 text

Dumping HSTORE

Slide 137

Slide 137 text

Dumping HSTORE Binary representation

Slide 138

Slide 138 text

Dumping HSTORE Binary representation Must be parsed and formatted

Slide 139

Slide 139 text

Dumping HSTORE Binary representation Must be parsed and formatted Quite fast:

Slide 140

Slide 140 text

Dumping HSTORE Binary representation Must be parsed and formatted Quite fast: % time pg_dump hreviews > /dev/null 0.30s user 0.23s system 37% cpu 1.379 total

Slide 141

Slide 141 text

When to use HSTORE

Slide 142

Slide 142 text

When to use HSTORE Fast key/value store

Slide 143

Slide 143 text

When to use HSTORE Fast key/value store Binary representation

Slide 144

Slide 144 text

When to use HSTORE Fast key/value store Binary representation Slower I/O

Slide 145

Slide 145 text

When to use HSTORE Fast key/value store Binary representation Slower I/O Faster operations

Slide 146

Slide 146 text

When to use HSTORE Fast key/value store Binary representation Slower I/O Faster operations GIN index support

Slide 147

Slide 147 text

When to use HSTORE Fast key/value store Binary representation Slower I/O Faster operations GIN index support Limited utility

Slide 148

Slide 148 text

When to use HSTORE Fast key/value store Binary representation Slower I/O Faster operations GIN index support Limited utility No nesting

Slide 149

Slide 149 text

When to use HSTORE Fast key/value store Binary representation Slower I/O Faster operations GIN index support Limited utility No nesting Strings only

Slide 150

Slide 150 text

When to use HSTORE Fast key/value store Binary representation Slower I/O Faster operations GIN index support Limited utility No nesting Strings only Custom format

Slide 151

Slide 151 text

When to use HSTORE Fast key/value store Binary representation Slower I/O Faster operations GIN index support Limited utility No nesting Strings only Custom format Requires parsing

Slide 152

Slide 152 text

—Douglas Crockford “I discovered JSON. I do not claim to have invented JSON, because it already existed in nature.”

Slide 153

Slide 153 text

JSON

Slide 154

Slide 154 text

JSON Added in 9.2

Slide 155

Slide 155 text

JSON Added in 9.2 Simple validation on input

Slide 156

Slide 156 text

JSON Added in 9.2 Simple validation on input Stored as text (like XML)

Slide 157

Slide 157 text

JSON Added in 9.2 Simple validation on input Stored as text (like XML) Uses server encoding

Slide 158

Slide 158 text

JSON Added in 9.2 Simple validation on input Stored as text (like XML) Uses server encoding Preserves key order and duplicates

Slide 159

Slide 159 text

JSON Added in 9.2 Simple validation on input Stored as text (like XML) Uses server encoding Preserves key order and duplicates Operators & Functions added in 9.3

Slide 160

Slide 160 text

JSON Added in 9.2 Simple validation on input Stored as text (like XML) Uses server encoding Preserves key order and duplicates Operators & Functions added in 9.3 Building functions in 9.4

Slide 161

Slide 161 text

JSON Operators

Slide 162

Slide 162 text

JSON Operators % SELECT '[{"a":"foo"},{"b":"bar"},{"c":"baz"}]'::json->2; ?column? ------------- {"c":"baz"} % Array lookup.

Slide 163

Slide 163 text

JSON Operators % SELECT '[{"a":"foo"},{"b":"bar"},{"c":"baz"}]'::json->2; ?column? ------------- {"c":"baz"} % SELECT '{"a": {"b":"foo"}}'::json->'a'; ?column? ------------- {"b":"foo"} % Key lookup.

Slide 164

Slide 164 text

JSON Operators % SELECT '[{"a":"foo"},{"b":"bar"},{"c":"baz"}]'::json->2; ?column? ------------- {"c":"baz"} % SELECT '{"a": {"b":"foo"}}'::json->'a'; ?column? ------------- {"b":"foo"} % SELECT '{"a":1,"b":2}'::json->>'b'; ?column? ---------- 2 % Returns text.

Slide 165

Slide 165 text

JSON Path Operators

Slide 166

Slide 166 text

JSON Path Operators % SELECT '[{"a": {"b":{"c": "foo"}}}'::json#>'{a,b}'; ?column? ------------- {"c": "foo"} % Path lookup.

Slide 167

Slide 167 text

JSON Path Operators % SELECT '[{"a": {"b":{"c": "foo"}}}'::json#>'{a,b}'; ?column? ------------- {"c": "foo"} % SELECT '{"a":[1,2,3],"b":[4,5,6]}'::json#>>'{a,2}'; ?column? ---------- 3 % Returns text.

Slide 168

Slide 168 text

JSON Constructors

Slide 169

Slide 169 text

JSON Constructors % SELECT to_json('Tom says "Hi"'::text); to_json ------------------- "Tom says \"Hi\"" % Scalars okay.

Slide 170

Slide 170 text

JSON Constructors % SELECT to_json('Tom says "Hi"'::text); to_json ------------------- "Tom says \"Hi\"" % SELECT to_json(ROW(1,2)); to_json ----------------- {"f1":1,"f2":2} % Composites objectified.

Slide 171

Slide 171 text

JSON Constructors % SELECT to_json('Tom says "Hi"'::text); to_json ------------------- "Tom says \"Hi\"" % SELECT to_json(ROW(1,2)); to_json ----------------- {"f1":1,"f2":2} % SELECT to_json(ROW(1,true,NULL,'foo')); to_json ----------------------------------------- {"f1":1,"f2":true,"f3":null,"f4":"foo"} % Data types respected.

Slide 172

Slide 172 text

JSON Constructors

Slide 173

Slide 173 text

JSON Constructors % SELECT json_build_array(1,2,'three'); json_build_array ------------------ [1, 2, "three"] % Heterogeneity okay.

Slide 174

Slide 174 text

JSON Constructors % SELECT json_build_array(1,2,'three'); json_build_array ------------------ [1, 2, "three"] % SELECT json_build_object('foo',1,'bar',true); json_build_object --------------------------- {"foo" : 1, "bar" : true} %

Slide 175

Slide 175 text

JSON Constructors % SELECT json_build_array(1,2,'three'); json_build_array ------------------ [1, 2, "three"] % SELECT json_build_object('foo',1,'bar',true); json_build_object --------------------------- {"foo" : 1, "bar" : true} % SELECT json_build_object( 'foo', 1, 'bar', json_build_array(1,2,'three') ); json_build_object -------------------------------------- {"foo" : 1, "bar" : [1, 2, "three"]} % Nesting!

Slide 176

Slide 176 text

JSON Sets

Slide 177

Slide 177 text

JSON Sets % SELECT * FROM json_each('{"a":"foo", "b":"bar"}'); key | value -----+------- a | "foo" b | "bar" % JSON Values

Slide 178

Slide 178 text

JSON Sets % SELECT * FROM json_each('{"a":"foo", "b":"bar"}'); key | value -----+------- a | "foo" b | "bar" % % SELECT * FROM json_each_text(‘{"a":"foo", "b":"bar"}'); key | value -----+------- a | foo b | bar % Text values.

Slide 179

Slide 179 text

Other JSON Functions

Slide 180

Slide 180 text

Other JSON Functions json_array_length()

Slide 181

Slide 181 text

Other JSON Functions json_array_length() json_object_keys()

Slide 182

Slide 182 text

Other JSON Functions json_array_length() json_object_keys() json_array_elements()

Slide 183

Slide 183 text

Other JSON Functions json_array_length() json_object_keys() json_array_elements() json_array_elements_text()

Slide 184

Slide 184 text

Other JSON Functions json_array_length() json_object_keys() json_array_elements() json_array_elements_text() json_typeof() json_to_record()

Slide 185

Slide 185 text

Other JSON Functions json_array_length() json_object_keys() json_array_elements() json_array_elements_text() json_typeof() json_to_record() And more!

Slide 186

Slide 186 text

JSON Performance

Slide 187

Slide 187 text

JSON Performance > createdb jreviews > psql -d jreviews -c 'CREATE TABLE reviews(review json);' CREATE TABLE >

Slide 188

Slide 188 text

JSON Performance > createdb jreviews > psql -d jreviews -c 'CREATE TABLE reviews(review json);' CREATE TABLE > time psql jreviews -c "COPY reviews FROM 'reviews.json'" COPY 589859 0.00s user 0.00s system 0% cpu 6.767 total

Slide 189

Slide 189 text

JSON Performance 86,413 records/second > createdb jreviews > psql -d jreviews -c 'CREATE TABLE reviews(review json);' CREATE TABLE > time psql jreviews -c "COPY reviews FROM 'reviews.json'" COPY 589859 0.00s user 0.00s system 0% cpu 6.767 total

Slide 190

Slide 190 text

JSON Performance 86,413 records/second 208 MB COPY file > createdb jreviews > psql -d jreviews -c 'CREATE TABLE reviews(review json);' CREATE TABLE > time psql jreviews -c "COPY reviews FROM 'reviews.json'" COPY 589859 0.00s user 0.00s system 0% cpu 6.767 total

Slide 191

Slide 191 text

JSON Performance 86,413 records/second 208 MB COPY file 240 MB Database > createdb jreviews > psql -d jreviews -c 'CREATE TABLE reviews(review json);' CREATE TABLE > time psql jreviews -c "COPY reviews FROM 'reviews.json'" COPY 589859 0.00s user 0.00s system 0% cpu 6.767 total

Slide 192

Slide 192 text

JSON Performance 86,413 records/second 208 MB COPY file 240 MB Database 15% storage overhead > createdb jreviews > psql -d jreviews -c 'CREATE TABLE reviews(review json);' CREATE TABLE > time psql jreviews -c "COPY reviews FROM 'reviews.json'" COPY 589859 0.00s user 0.00s system 0% cpu 6.767 total

Slide 193

Slide 193 text

JSON Performance 86,413 records/second 208 MB COPY file 240 MB Database 15% storage overhead Faster than HSTORE > createdb jreviews > psql -d jreviews -c 'CREATE TABLE reviews(review json);' CREATE TABLE > time psql jreviews -c "COPY reviews FROM 'reviews.json'" COPY 589859 0.00s user 0.00s system 0% cpu 6.767 total

Slide 194

Slide 194 text

JSON Performance 86,413 records/second 208 MB COPY file 240 MB Database 15% storage overhead Faster than HSTORE Slightly more overhead > createdb jreviews > psql -d jreviews -c 'CREATE TABLE reviews(review json);' CREATE TABLE > time psql jreviews -c "COPY reviews FROM 'reviews.json'" COPY 589859 0.00s user 0.00s system 0% cpu 6.767 total

Slide 195

Slide 195 text

Bucket ’O Books 2

Slide 196

Slide 196 text

Bucket ’O Books 2 % SELECT width_bucket(length(review#>>'{product,title}'), 1, 50, 5) title_bkt, round(avg((review#>>'{review,rating}')::numeric), 2) review_avg, COUNT(*) FROM reviews WHERE review#>>'{product,group}' = 'DVD' GROUP BY title_bkt ORDER BY title_bkt; Path lookup.

Slide 197

Slide 197 text

Bucket ’O Books 2 % SELECT width_bucket(length(review#>>'{product,title}'), 1, 50, 5) title_bkt, round(avg((review#>>'{review,rating}')::numeric), 2) review_avg, COUNT(*) FROM reviews WHERE review#>>'{product,group}' = 'DVD' GROUP BY title_bkt ORDER BY title_bkt; title_bkt | review_avg | count -----------+------------+------- 1 | 4.27 | 2646 2 | 4.44 | 4180 3 | 4.53 | 1996 4 | 4.38 | 2294 5 | 4.48 | 943 6 | 4.42 | 738 (6 rows) Time: 1765.824 ms Yow!

Slide 198

Slide 198 text

JSON Indexing

Slide 199

Slide 199 text

JSON Indexing Operations slower than HSTORE

Slide 200

Slide 200 text

JSON Indexing Operations slower than HSTORE Text parsed per operation

Slide 201

Slide 201 text

JSON Indexing Operations slower than HSTORE Text parsed per operation No GIN or GiST

Slide 202

Slide 202 text

JSON Indexing Operations slower than HSTORE Text parsed per operation No GIN or GiST Can use expression index

Slide 203

Slide 203 text

JSON Indexing Operations slower than HSTORE Text parsed per operation No GIN or GiST Can use expression index % CREATE INDEX idx_dvd_reviews ON reviews ((review#>>'{product,group}')); CREATE INDEX Time: 10222.241 ms

Slide 204

Slide 204 text

Back to the Books 3

Slide 205

Slide 205 text

Back to the Books 3 % SELECT width_bucket(length(review#>>'{product,title}'), 1, 50, 5) title_bkt, round(avg((review#>>'{review,rating}')::numeric), 2) review_avg, COUNT(*) FROM reviews WHERE review#>>'{product,group}' = 'DVD' GROUP BY title_bkt ORDER BY title_bkt; title_bkt | review_avg | count -----------+------------+------- 1 | 4.27 | 2646 2 | 4.44 | 4180 3 | 4.53 | 1996 4 | 4.38 | 2294 5 | 4.48 | 943 6 | 4.42 | 738 (6 rows) Time: 91.863 ms Pretty good.

Slide 206

Slide 206 text

Dumping JSON

Slide 207

Slide 207 text

Dumping JSON Stored as text

Slide 208

Slide 208 text

Dumping JSON Stored as text No parsing on output

Slide 209

Slide 209 text

Dumping JSON Stored as text No parsing on output Dumping 50-60% faster than HSTORE

Slide 210

Slide 210 text

Dumping JSON Stored as text No parsing on output Dumping 50-60% faster than HSTORE % time pg_dump jreviews > /dev/null 0.22s user 0.17s system 65% cpu 0.59 total

Slide 211

Slide 211 text

When to use JSON

Slide 212

Slide 212 text

When to use JSON Document storage

Slide 213

Slide 213 text

When to use JSON Document storage Duplicate preservation

Slide 214

Slide 214 text

When to use JSON Document storage Duplicate preservation Key order preservation

Slide 215

Slide 215 text

When to use JSON Document storage Duplicate preservation Key order preservation Good storage

Slide 216

Slide 216 text

When to use JSON Document storage Duplicate preservation Key order preservation Good storage Stored as text

Slide 217

Slide 217 text

When to use JSON Document storage Duplicate preservation Key order preservation Good storage Stored as text Fast I/O

Slide 218

Slide 218 text

When to use JSON Document storage Duplicate preservation Key order preservation Good storage Stored as text Fast I/O Operations are Awesome

Slide 219

Slide 219 text

When to use JSON Document storage Duplicate preservation Key order preservation Good storage Stored as text Fast I/O Operations are Awesome Fetch keys, paths

Slide 220

Slide 220 text

When to use JSON Document storage Duplicate preservation Key order preservation Good storage Stored as text Fast I/O Operations are Awesome Fetch keys, paths Best on indexed scalars

Slide 221

Slide 221 text

When to use JSON Document storage Duplicate preservation Key order preservation Good storage Stored as text Fast I/O Operations are Awesome Fetch keys, paths Best on indexed scalars When table scans okay

Slide 222

Slide 222 text

When to use JSON Document storage Duplicate preservation Key order preservation Good storage Stored as text Fast I/O Operations are Awesome Fetch keys, paths Best on indexed scalars When table scans okay Encumbers per-row parsing overhead

Slide 223

Slide 223 text

When to use JSON Document storage Duplicate preservation Key order preservation Good storage Stored as text Fast I/O Operations are Awesome Fetch keys, paths Best on indexed scalars When table scans okay Encumbers per-row parsing overhead Sound familiar?

Slide 224

Slide 224 text

— Mike MacCana via HN “Does this [JSONB] mean I can do Mongo-style queries, retrieving a set of documents which match particular key: value criteria, using PostgreSQL?”

Slide 225

Slide 225 text

JSONB

Slide 226

Slide 226 text

JSONB New in 9.4

Slide 227

Slide 227 text

JSONB New in 9.4 Full JSON implementation

Slide 228

Slide 228 text

JSONB New in 9.4 Full JSON implementation Uses server encoding

Slide 229

Slide 229 text

JSONB New in 9.4 Full JSON implementation Uses server encoding Inspired by HSTORE 2

Slide 230

Slide 230 text

JSONB New in 9.4 Full JSON implementation Uses server encoding Inspired by HSTORE 2 Binary storage

Slide 231

Slide 231 text

JSONB New in 9.4 Full JSON implementation Uses server encoding Inspired by HSTORE 2 Binary storage HSTORE-style query operators

Slide 232

Slide 232 text

JSONB New in 9.4 Full JSON implementation Uses server encoding Inspired by HSTORE 2 Binary storage HSTORE-style query operators No key order or duplicate preservation

Slide 233

Slide 233 text

JSONB New in 9.4 Full JSON implementation Uses server encoding Inspired by HSTORE 2 Binary storage HSTORE-style query operators No key order or duplicate preservation Fast access operations

Slide 234

Slide 234 text

JSONB New in 9.4 Full JSON implementation Uses server encoding Inspired by HSTORE 2 Binary storage HSTORE-style query operators No key order or duplicate preservation Fast access operations GIN indexing

Slide 235

Slide 235 text

JSONB Operators

Slide 236

Slide 236 text

JSONB Operators All the JSON operators, plus…

Slide 237

Slide 237 text

JSONB Operators All the JSON operators, plus… % SELECT '{"a": 1, "b": 2}'::jsonb = '{"b": 2, "a": 1}'::jsonb; ?column? ---------- t % Equality!

Slide 238

Slide 238 text

JSONB Operators All the JSON operators, plus… % SELECT '{"a": 1, "b": 2}'::jsonb = '{"b": 2, "a": 1}'::jsonb; ?column? ---------- t % SELECT '{"a":1, "b":2}'::jsonb @> '{"b":2}'::jsonb; ?column? ---------- t % Containment!

Slide 239

Slide 239 text

JSONB Operators All the JSON operators, plus… % SELECT '{"a": 1, "b": 2}'::jsonb = '{"b": 2, "a": 1}'::jsonb; ?column? ---------- t % SELECT '{"a":1, "b":2}'::jsonb @> '{"b":2}'::jsonb; ?column? ---------- t % SELECT '{"a":1, "b":2}'::json ? 'b'; ?column? ---------- t % Existence!

Slide 240

Slide 240 text

Nested JSONB Operators

Slide 241

Slide 241 text

Nested JSONB Operators % SELECT '{"a": [1,2]}'::jsonb = '{"a": [1,2]}'::jsonb; ?column? ---------- t % Nested array.

Slide 242

Slide 242 text

Nested JSONB Operators % SELECT '{"a": [1,2]}'::jsonb = '{"a": [1,2]}'::jsonb; ?column? ---------- t % SELECT '{"a": {"b": 2, "c": 3}}'::jsonb @> '{"a": {"c": 3}}'::jsonb; ?column? ---------- t % Paths including values..

Slide 243

Slide 243 text

Nested JSONB Operators % SELECT '{"a": [1,2]}'::jsonb = '{"a": [1,2]}'::jsonb; ?column? ---------- t % SELECT '{"a": {"b": 2, "c": 3}}'::jsonb @> '{"a": {"c": 3}}'::jsonb; ?column? ---------- t % SELECT '[1, 2, 3]'::jsonb @> '[3, 1]'::jsonb; ?column? ---------- t % Ordering ignored.

Slide 244

Slide 244 text

JSONB Existence

Slide 245

Slide 245 text

JSONB Existence % SELECT '{"a":1, "b":2, "c":3}'::jsonb ?| ARRAY['b', 'd']; ?column? ------------- t % Any exist?

Slide 246

Slide 246 text

JSONB Existence % SELECT '{"a":1, "b":2, "c":3}'::jsonb ?| ARRAY['b', 'd']; ?column? ------------- t % SELECT '["a", "b", "c"]'::jsonb ?& ARRAY['a', 'b']; ?column? ------------- t % All exist?

Slide 247

Slide 247 text

JSONB Performance

Slide 248

Slide 248 text

JSONB Performance > createdb breviews > psql -d breviews -c 'CREATE TABLE reviews(review jsonb);' CREATE TABLE >

Slide 249

Slide 249 text

JSONB Performance > createdb breviews > psql -d breviews -c 'CREATE TABLE reviews(review jsonb);' CREATE TABLE > time psql breviews -c "COPY reviews FROM 'reviews.json'" COPY 589859 0.00s user 0.00s system 0% cpu 9.841 total

Slide 250

Slide 250 text

JSONB Performance 59,939 records/second > createdb breviews > psql -d breviews -c 'CREATE TABLE reviews(review jsonb);' CREATE TABLE > time psql breviews -c "COPY reviews FROM 'reviews.json'" COPY 589859 0.00s user 0.00s system 0% cpu 9.841 total

Slide 251

Slide 251 text

JSONB Performance 59,939 records/second 208 MB COPY file > createdb breviews > psql -d breviews -c 'CREATE TABLE reviews(review jsonb);' CREATE TABLE > time psql breviews -c "COPY reviews FROM 'reviews.json'" COPY 589859 0.00s user 0.00s system 0% cpu 9.841 total

Slide 252

Slide 252 text

JSONB Performance 59,939 records/second 208 MB COPY file 277 MB Database > createdb breviews > psql -d breviews -c 'CREATE TABLE reviews(review jsonb);' CREATE TABLE > time psql breviews -c "COPY reviews FROM 'reviews.json'" COPY 589859 0.00s user 0.00s system 0% cpu 9.841 total

Slide 253

Slide 253 text

JSONB Performance 59,939 records/second 208 MB COPY file 277 MB Database 32.9% storage overhead > createdb breviews > psql -d breviews -c 'CREATE TABLE reviews(review jsonb);' CREATE TABLE > time psql breviews -c "COPY reviews FROM 'reviews.json'" COPY 589859 0.00s user 0.00s system 0% cpu 9.841 total

Slide 254

Slide 254 text

JSONB Performance 59,939 records/second 208 MB COPY file 277 MB Database 32.9% storage overhead Slower than JSON (86,413 r/s) > createdb breviews > psql -d breviews -c 'CREATE TABLE reviews(review jsonb);' CREATE TABLE > time psql breviews -c "COPY reviews FROM 'reviews.json'" COPY 589859 0.00s user 0.00s system 0% cpu 9.841 total

Slide 255

Slide 255 text

JSONB Performance 59,939 records/second 208 MB COPY file 277 MB Database 32.9% storage overhead Slower than JSON (86,413 r/s) Bigger than JSON (15% overhead) > createdb breviews > psql -d breviews -c 'CREATE TABLE reviews(review jsonb);' CREATE TABLE > time psql breviews -c "COPY reviews FROM 'reviews.json'" COPY 589859 0.00s user 0.00s system 0% cpu 9.841 total

Slide 256

Slide 256 text

Bucket ’O Books Redux

Slide 257

Slide 257 text

Bucket ’O Books Redux % SELECT width_bucket(length(review#>>'{product,title}'), 1, 50, 5) title_bkt, round(avg((review#>>'{review,rating}')::numeric), 2) review_avg, COUNT(*) FROM reviews WHERE review#>>'{product,group}' = 'DVD' GROUP BY title_bkt ORDER BY title_bkt; title_bkt | review_avg | count -----------+------------+------- 1 | 4.27 | 2646 2 | 4.44 | 4180 3 | 4.53 | 1996 4 | 4.38 | 2294 5 | 4.48 | 943 6 | 4.42 | 738 (6 rows) Time: 381.158 ms JSON: 1765.824 HSTORE: 207.665

Slide 258

Slide 258 text

JSONB GIN Indexing

Slide 259

Slide 259 text

JSONB GIN Indexing Supports GIN Index

Slide 260

Slide 260 text

JSONB GIN Indexing Supports GIN Index Supports @>, ?, ?& and ?| operators
 
 


Slide 261

Slide 261 text

JSONB GIN Indexing Supports GIN Index Supports @>, ?, ?& and ?| operators
 
 
 % CREATE INDEX idx_reviews_gin ON reviews USING gin(review); CREATE INDEX Time: 20296.090 ms % 10x faster than HSTORE.

Slide 262

Slide 262 text

JSONB GIN Indexing Supports GIN Index Supports @>, ?, ?& and ?| operators
 
 
 DB Size: 341 MB % CREATE INDEX idx_reviews_gin ON reviews USING gin(review); CREATE INDEX Time: 20296.090 ms %

Slide 263

Slide 263 text

JSONB GIN Indexing Supports GIN Index Supports @>, ?, ?& and ?| operators
 
 
 DB Size: 341 MB 23.14% overhead % CREATE INDEX idx_reviews_gin ON reviews USING gin(review); CREATE INDEX Time: 20296.090 ms %

Slide 264

Slide 264 text

JSONB GIN Indexing Supports GIN Index Supports @>, ?, ?& and ?| operators
 
 
 DB Size: 341 MB 23.14% overhead Was 17.6% for HSTORE % CREATE INDEX idx_reviews_gin ON reviews USING gin(review); CREATE INDEX Time: 20296.090 ms %

Slide 265

Slide 265 text

JSONB GIN Performance

Slide 266

Slide 266 text

JSONB GIN Performance % SELECT width_bucket(length(review#>>'{product,title}'), 1, 50, 5) title_bkt, round(avg((review#>>'{review,rating}')::numeric), 2) review_avg, COUNT(*) FROM reviews WHERE review @> '{"product": {"group": "DVD"}}' GROUP BY title_bkt ORDER BY title_bkt; Containment.

Slide 267

Slide 267 text

JSONB GIN Performance % SELECT width_bucket(length(review#>>'{product,title}'), 1, 50, 5) title_bkt, round(avg((review#>>'{review,rating}')::numeric), 2) review_avg, COUNT(*) FROM reviews WHERE review @> '{"product": {"group": "DVD"}}' GROUP BY title_bkt ORDER BY title_bkt; title_bkt | review_avg | count -----------+------------+------- 1 | 4.27 | 2646 2 | 4.44 | 4180 3 | 4.53 | 1996 4 | 4.38 | 2294 5 | 4.48 | 943 6 | 4.42 | 738 (6 rows) Time: 35.633 ms Pretty good.

Slide 268

Slide 268 text

JSONB GIN json_path_ops

Slide 269

Slide 269 text

JSONB GIN json_path_ops jsonb_path_ops for @> more efficient

Slide 270

Slide 270 text

JSONB GIN json_path_ops jsonb_path_ops for @> more efficient One index entry per path
 
 
 


Slide 271

Slide 271 text

JSONB GIN json_path_ops jsonb_path_ops for @> more efficient One index entry per path
 
 
 
 % DROP INDEX idx_reviews_gin; DROP INDEX CREATE INDEX idx_reviews_gin ON reviews USING gin(review jsonb_path_ops); CREATE INDEX Time: 9086.793 ms % 2x faster.

Slide 272

Slide 272 text

JSONB GIN json_path_ops jsonb_path_ops for @> more efficient One index entry per path
 
 
 
 DB Size: 323 MB % DROP INDEX idx_reviews_gin; DROP INDEX CREATE INDEX idx_reviews_gin ON reviews USING gin(review jsonb_path_ops); CREATE INDEX Time: 9086.793 ms %

Slide 273

Slide 273 text

JSONB GIN json_path_ops jsonb_path_ops for @> more efficient One index entry per path
 
 
 
 DB Size: 323 MB 16.6% overhead % DROP INDEX idx_reviews_gin; DROP INDEX CREATE INDEX idx_reviews_gin ON reviews USING gin(review jsonb_path_ops); CREATE INDEX Time: 9086.793 ms %

Slide 274

Slide 274 text

json_path_ops Performance

Slide 275

Slide 275 text

json_path_ops Performance % SELECT width_bucket(length(review#>>'{product,title}'), 1, 50, 5) title_bkt, round(avg((review#>>'{review,rating}')::numeric), 2) review_avg, COUNT(*) FROM reviews WHERE review @> '{"product": {"group": "DVD"}}' GROUP BY title_bkt ORDER BY title_bkt; title_bkt | review_avg | count -----------+------------+------- 1 | 4.27 | 2646 2 | 4.44 | 4180 3 | 4.53 | 1996 4 | 4.38 | 2294 5 | 4.48 | 943 6 | 4.42 | 738 (6 rows) Time: 29.817 ms About the same.

Slide 276

Slide 276 text

Dumping JSONB

Slide 277

Slide 277 text

Dumping JSONB Binary representation

Slide 278

Slide 278 text

Dumping JSONB Binary representation Must be parsed on output

Slide 279

Slide 279 text

Dumping JSONB Binary representation Must be parsed on output Dumping 41% slower than HSTORE

Slide 280

Slide 280 text

Dumping JSONB Binary representation Must be parsed on output Dumping 41% slower than HSTORE % time pg_dump breviews > /dev/null 0.19s user 0.16s system 17% cpu 1.95 total

Slide 281

Slide 281 text

When to use JSONB

Slide 282

Slide 282 text

When to use JSONB When on 9.4

Slide 283

Slide 283 text

When to use JSONB When on 9.4 No duplicate preservation

Slide 284

Slide 284 text

When to use JSONB When on 9.4 No duplicate preservation No key order preservation

Slide 285

Slide 285 text

When to use JSONB When on 9.4 No duplicate preservation No key order preservation Great object storage

Slide 286

Slide 286 text

When to use JSONB When on 9.4 No duplicate preservation No key order preservation Great object storage Binary representation

Slide 287

Slide 287 text

When to use JSONB When on 9.4 No duplicate preservation No key order preservation Great object storage Binary representation Efficient operators

Slide 288

Slide 288 text

When to use JSONB When on 9.4 No duplicate preservation No key order preservation Great object storage Binary representation Efficient operators Operations are Awesome

Slide 289

Slide 289 text

When to use JSONB When on 9.4 No duplicate preservation No key order preservation Great object storage Binary representation Efficient operators Operations are Awesome Fetch keys, paths

Slide 290

Slide 290 text

When to use JSONB When on 9.4 No duplicate preservation No key order preservation Great object storage Binary representation Efficient operators Operations are Awesome Fetch keys, paths GIN index-aware

Slide 291

Slide 291 text

When to use JSONB When on 9.4 No duplicate preservation No key order preservation Great object storage Binary representation Efficient operators Operations are Awesome Fetch keys, paths GIN index-aware Expression indexes with GIN

Slide 292

Slide 292 text

When to use JSONB When on 9.4 No duplicate preservation No key order preservation Great object storage Binary representation Efficient operators Operations are Awesome Fetch keys, paths GIN index-aware Expression indexes with GIN Slower I/O

Slide 293

Slide 293 text

Review

Slide 294

Slide 294 text

Review XML

Slide 295

Slide 295 text

Review XML Only when necessary

Slide 296

Slide 296 text

Review XML Only when necessary HSTORE

Slide 297

Slide 297 text

Review XML Only when necessary HSTORE Flat okay

Slide 298

Slide 298 text

Review XML Only when necessary HSTORE Flat okay Strings okay

Slide 299

Slide 299 text

Review XML Only when necessary HSTORE Flat okay Strings okay Fast operations

Slide 300

Slide 300 text

Review XML Only when necessary HSTORE Flat okay Strings okay Fast operations JSON

Slide 301

Slide 301 text

Review XML Only when necessary HSTORE Flat okay Strings okay Fast operations JSON Document storage

Slide 302

Slide 302 text

Review XML Only when necessary HSTORE Flat okay Strings okay Fast operations JSON Document storage Key preservation

Slide 303

Slide 303 text

Review XML Only when necessary HSTORE Flat okay Strings okay Fast operations JSON Document storage Key preservation JSONB

Slide 304

Slide 304 text

Review XML Only when necessary HSTORE Flat okay Strings okay Fast operations JSON Document storage Key preservation JSONB Everything else

Slide 305

Slide 305 text

—Theory “Use JSONB unless you have a very good reason not to.”

Slide 306

Slide 306 text

XML, HSTORE, JSON, JSONB—OH MY! David E. Wheeler @theory theory.so ! PGCon 2014-05-23