Upgrade to Pro — share decks privately, control downloads, hide ads and more …

XML, HSTORE, JSON, JSONB—OH MY!

XML, HSTORE, JSON, JSONB—OH MY!

PostgreSQL 9.4 adds a new unstructured data type, JSONB, a variant of JSON optimized for storing objects. JSONB complements the existing XML and JSON document objects, as well as HSTORE. Which one is right for you? We'll take a tour of the features of each, and their advantages and disadvantages for various use cases, all illustrated with real-world examples.

David E. Wheeler

May 23, 2014
Tweet

More Decks by David E. Wheeler

Other Decks in Programming

Transcript

  1. XML, HSTORE, JSON,
    JSONB—OH MY!
    David E. Wheeler
    @theory
    iovation
    !
    PGCon
    2014-05-23

    View full-size slide

  2. Unstructured Data Types
    Several supported
    XML, HSTORE, JSON, JSONB
    Which should use use?
    When should you use it?
    Why?
    A brief tour

    View full-size slide

  3. “It has been said that XML is like
    violence; if a little doesn’t solve
    the problem, use more.”
    — Chris Maden

    View full-size slide

  4. XML
    Added in 8.2

    View full-size slide

  5. XML
    Added in 8.2
    Data Type

    View full-size slide

  6. XML
    Added in 8.2
    Data Type
    Publishing

    View full-size slide

  7. XML
    Added in 8.2
    Data Type
    Publishing
    Export

    View full-size slide

  8. XML
    Added in 8.2
    Data Type
    Publishing
    Export
    SQL:2003 Conformance

    View full-size slide

  9. XML
    Added in 8.2
    Data Type
    Publishing
    Export
    SQL:2003 Conformance
    XPath

    View full-size slide

  10. XML Implementation

    View full-size slide

  11. XML Implementation
    Input validation

    View full-size slide

  12. XML Implementation
    Input validation
    Text storage

    View full-size slide

  13. XML Implementation
    Input validation
    Text storage
    No comparison operators

    View full-size slide

  14. XML Implementation
    Input validation
    Text storage
    No comparison operators
    So no indexing

    View full-size slide

  15. XML Implementation
    Input validation
    Text storage
    No comparison operators
    So no indexing
    Can cast to text

    View full-size slide

  16. XML Implementation
    Input validation
    Text storage
    No comparison operators
    So no indexing
    Can cast to text
    Better: Use XPath

    View full-size slide

  17. XML Generation

    View full-size slide

  18. XML Generation
    xmlelement()

    View full-size slide

  19. XML Generation
    xmlelement()
    xmlattributes()

    View full-size slide

  20. XML Generation
    xmlelement()
    xmlattributes()
    xmlcomment()
    xmlconcat()

    View full-size slide

  21. XML Generation
    xmlelement()
    xmlattributes()
    xmlcomment()
    xmlconcat()
    % SELECT xmlelement(name foo,
    xmlattributes('xyz' as bar),
    xmlelement(name abc),
    xmlcomment('ow'),
    xmlelement(name xyz,
    xmlelement(name yack, 'barber')
    )
    );
    xmlelement
    --------------------------------------------------------------------
    barber

    View full-size slide

  22. XML Predicates

    View full-size slide

  23. XML Predicates
    IS DOCUMENT

    View full-size slide

  24. XML Predicates
    IS DOCUMENT
    xml_is_well_formed()

    View full-size slide

  25. XML Predicates
    IS DOCUMENT
    xml_is_well_formed()
    XMLEXISTS()

    View full-size slide

  26. XML Predicates
    IS DOCUMENT
    xml_is_well_formed()
    XMLEXISTS()
    % SELECT xmlexists(
    $$//town[text() = 'Ottawa']$$
    PASSING 'PortlandOttawa'
    );
    xmlexists
    -----------
    t
    XPath!

    View full-size slide

  27. XPath
    % SELECT xpath(
    '/p/a/text()',
    'testme'
    );
    xpath
    -----------
    {test,me}
    %

    View full-size slide

  28. XPath
    % SELECT xpath(
    '/p/a/text()',
    'testme'
    );
    xpath
    -----------
    {test,me}
    % SELECT xpath_exists(
    '/p/a/text()',
    'testme'
    );
    xpath_exists
    --------------
    t

    View full-size slide

  29. XPath
    % SELECT xpath(
    '/p/a/text()',
    'testme'
    );
    xpath
    -----------
    {test,me}
    % SELECT xpath_exists(
    '/p/a/text()',
    'testme'
    );
    xpath_exists
    --------------
    t
    Supports
    namespacing.

    View full-size slide

  30. XML Table Mapping

    View full-size slide

  31. XML Table Mapping
    % SELECT query_to_xml($$
    SELECT n.nspname, c.relname
    FROM pg_class c JOIN pg_namespace n
    ON c.relnamespace = n.oid
    WHERE n.nspname NOT IN ('pg_catalog', 'pg_toast',
    'information_schema')
    $$, true, false, '');
    query_to_xml
    ---------------------------------------------------------------
    +
    +
    public +
    stuff +
    +
    +
    public +
    stuff_pkey +
    +
    +
    public +
    idx_stuff_somekey +
    +
    +

    View full-size slide

  32. Querying
    % \set xpath '\'/us-patent-grant/us-bibliographic-data-grant/examiners/*/last-name/text()\''

    View full-size slide

  33. Querying
    % \set xpath '\'/us-patent-grant/us-bibliographic-data-grant/examiners/*/last-name/text()\''
    % SELECT COUNT(*)
    FROM grants
    WHERE '{Brooks,Mroczka}' = xpath(:xpath, xmldoc)::text[];
    count
    -------
    4
    (1 row)
    Time: 37311.163 ms

    View full-size slide

  34. Querying
    % \set xpath '\'/us-patent-grant/us-bibliographic-data-grant/examiners/*/last-name/text()\''
    % SELECT COUNT(*)
    FROM grants
    WHERE '{Brooks,Mroczka}' = xpath(:xpath, xmldoc)::text[];
    count
    -------
    4
    (1 row)
    Time: 37311.163 ms A bit slow.

    View full-size slide

  35. Index Query
    % CREATE INDEX idx_grant_examiners
    ON grants(CAST(xpath(:xpath, xmldoc) AS text[]));
    CREATE INDEX
    %

    View full-size slide

  36. Index Query
    % CREATE INDEX idx_grant_examiners
    ON grants(CAST(xpath(:xpath, xmldoc) AS text[]));
    CREATE INDEX
    % SELECT COUNT(*)
    FROM grants
    WHERE '{Brooks,Mroczka}' = xpath(:xpath, xmldoc)::text[];
    count
    -------
    4
    (1 row)
    Time: 0.716 ms
    Nice.

    View full-size slide

  37. Query Index Values

    View full-size slide

  38. Query Index Values
    % SELECT COUNT(*)
    FROM grants
    WHERE 'Brooks' = ANY(xpath(:xpath, xmldoc)::text[]);
    count
    -------
    34
    (1 row)
    Time: 39348.995 ms
    Table scan

    View full-size slide

  39. Query Index Values
    % SELECT COUNT(*)
    FROM grants
    WHERE 'Brooks' = ANY(xpath(:xpath, xmldoc)::text[]);
    count
    -------
    34
    (1 row)
    Time: 39348.995 ms
    Yikes!

    View full-size slide

  40. Querying
    % \set xpath '\'/us-patent-grant/us-bibliographic-data-grant/examiners/primary-examiner/last-name/text()\''

    View full-size slide

  41. Querying
    % \set xpath '\'/us-patent-grant/us-bibliographic-data-grant/examiners/primary-examiner/last-name/text()\''
    % SELECT COUNT(*)
    FROM grants
    WHERE CAST((xpath(:xpath, xmldoc))[1] AS text) = 'Brooks';
    count
    -------
    12
    (1 row)
    Time: 37187.900 ms

    View full-size slide

  42. Index Query
    % CREATE INDEX idx_grant_primary_examiner
    ON grants(CAST((xpath(:xpath, xmldoc))[1] AS text));
    CREATE INDEX
    %

    View full-size slide

  43. Index Query
    % CREATE INDEX idx_grant_primary_examiner
    ON grants(CAST((xpath(:xpath, xmldoc))[1] AS text));
    CREATE INDEX
    % SELECT count(*)
    FROM grants
    WHERE CAST((xpath(:xpath, xmldoc))[1] AS text) = 'Brooks';
    count
    -------
    12
    (1 row)
    Time: 1.046 ms

    View full-size slide

  44. Index Query
    % CREATE INDEX idx_grant_primary_examiner
    ON grants(CAST((xpath(:xpath, xmldoc))[1] AS text));
    CREATE INDEX
    % SELECT count(*)
    FROM grants
    WHERE CAST((xpath(:xpath, xmldoc))[1] AS text) = 'Brooks';
    count
    -------
    12
    (1 row)
    Time: 1.046 ms
    Scalar

    View full-size slide

  45. Index Query
    % CREATE INDEX idx_grant_primary_examiner
    ON grants(CAST((xpath(:xpath, xmldoc))[1] AS text));
    CREATE INDEX
    % SELECT count(*)
    FROM grants
    WHERE CAST((xpath(:xpath, xmldoc))[1] AS text) = 'Brooks';
    count
    -------
    12
    (1 row)
    Time: 1.046 ms
    Mo betta.

    View full-size slide

  46. When to use XML

    View full-size slide

  47. When to use XML
    When XML required:

    View full-size slide

  48. When to use XML
    When XML required:
    Existing documents

    View full-size slide

  49. When to use XML
    When XML required:
    Existing documents
    SOAP API

    View full-size slide

  50. When to use XML
    When XML required:
    Existing documents
    SOAP API
    XHTML

    View full-size slide

  51. When to use XML
    When XML required:
    Existing documents
    SOAP API
    XHTML
    Good document storage

    View full-size slide

  52. When to use XML
    When XML required:
    Existing documents
    SOAP API
    XHTML
    Good document storage
    Stored as text

    View full-size slide

  53. When to use XML
    When XML required:
    Existing documents
    SOAP API
    XHTML
    Good document storage
    Stored as text
    Fast I/O

    View full-size slide

  54. When to use XML
    When XML required:
    Existing documents
    SOAP API
    XHTML
    Good document storage
    Stored as text
    Fast I/O
    XPath is Awesome

    View full-size slide

  55. When to use XML
    When XML required:
    Existing documents
    SOAP API
    XHTML
    Good document storage
    Stored as text
    Fast I/O
    XPath is Awesome
    Best on indexed
    scalars

    View full-size slide

  56. When to use XML
    When XML required:
    Existing documents
    SOAP API
    XHTML
    Good document storage
    Stored as text
    Fast I/O
    XPath is Awesome
    Best on indexed
    scalars
    When table scans okay

    View full-size slide

  57. When to use XML
    When XML required:
    Existing documents
    SOAP API
    XHTML
    Good document storage
    Stored as text
    Fast I/O
    XPath is Awesome
    Best on indexed
    scalars
    When table scans okay
    Encumbers per-row
    parsing overhead

    View full-size slide

  58. —Theory
    “HSTORE: Like Perl hashes, but
    flatter, less typed, and
    incompatible.”

    View full-size slide

  59. HSTORE
    Also added in 8.2

    View full-size slide

  60. HSTORE
    Also added in 8.2
    Simple key/value pairs

    View full-size slide

  61. HSTORE
    Also added in 8.2
    Simple key/value pairs
    Strings only

    View full-size slide

  62. HSTORE
    Also added in 8.2
    Simple key/value pairs
    Strings only
    No nested values

    View full-size slide

  63. HSTORE
    Also added in 8.2
    Simple key/value pairs
    Strings only
    No nested values
    Lots of useful operators

    View full-size slide

  64. HSTORE
    Also added in 8.2
    Simple key/value pairs
    Strings only
    No nested values
    Lots of useful operators
    GiST and GIN indexing

    View full-size slide

  65. HSTORE Improvements

    View full-size slide

  66. HSTORE Improvements
    Improved in 9.0:

    View full-size slide

  67. HSTORE Improvements
    Improved in 9.0:
    Many new operators

    View full-size slide

  68. HSTORE Improvements
    Improved in 9.0:
    Many new operators
    BTree and Hash indexing

    View full-size slide

  69. HSTORE Improvements
    Improved in 9.0:
    Many new operators
    BTree and Hash indexing
    Increased capacity

    View full-size slide

  70. HSTORE Syntax

    View full-size slide

  71. HSTORE Syntax
    % CREATE EXTENSION hstore;
    CREATE EXTENSION
    %

    View full-size slide

  72. HSTORE Syntax
    % CREATE EXTENSION hstore;
    CREATE EXTENSION
    % SELECT 'a => 1, a => 2'::hstore;
    hstore
    ----------
    "a"=>"1"
    %

    View full-size slide

  73. HSTORE Syntax
    % CREATE EXTENSION hstore;
    CREATE EXTENSION
    % SELECT 'a => 1, a => 2'::hstore;
    hstore
    ----------
    "a"=>"1"
    % SELECT '"hey there" => NULL, "salad" => "super"'::hstore;
    hstore
    -------------------------------------
    "salad"=>"super", "hey there"=>NULL
    %

    View full-size slide

  74. HSTORE Operators

    View full-size slide

  75. HSTORE Operators
    % SELECT 'user => "fred", id => 1'::hstore -> 'user' AS user;
    user
    ------
    fred
    %
    Value for
    key.

    View full-size slide

  76. HSTORE Operators
    % SELECT 'user => "fred", id => 1'::hstore -> 'user' AS user;
    user
    ------
    fred
    % SELECT 'user => "fred", id => 1'::hstore
    -> ARRAY['user','id'] AS vals;
    vals
    ----------
    {fred,1}
    %
    Value for
    keys.

    View full-size slide

  77. HSTORE Operators
    % SELECT 'user => "fred", id => 1'::hstore -> 'user' AS user;
    user
    ------
    fred
    % SELECT 'user => "fred", id => 1'::hstore
    -> ARRAY['user','id'] AS vals;
    vals
    ----------
    {fred,1}
    % SELECT 'user => "fred", id => 1'::hstore @> 'id=>1' AS one;
    one
    -----
    t Does left
    contain right?

    View full-size slide

  78. More HSTORE Operators

    View full-size slide

  79. More HSTORE Operators
    || Concatenation

    View full-size slide

  80. More HSTORE Operators
    || Concatenation
    ? Key exists

    View full-size slide

  81. More HSTORE Operators
    || Concatenation
    ? Key exists
    ?& Keys exist

    View full-size slide

  82. More HSTORE Operators
    || Concatenation
    ? Key exists
    ?& Keys exist
    ?| Do any keys exist

    View full-size slide

  83. More HSTORE Operators
    || Concatenation
    ? Key exists
    ?& Keys exist
    ?| Do any keys exist
    - Delete key or keys

    View full-size slide

  84. More HSTORE Operators
    || Concatenation
    ? Key exists
    ?& Keys exist
    ?| Do any keys exist
    - Delete key or keys
    %% Convert to array

    View full-size slide

  85. HSTORE Functions

    View full-size slide

  86. % SELECT hstore('a', 'b');
    hstore
    ----------
    "a"=>"b"
    %
    HSTORE Functions
    HSTORE
    constructor.

    View full-size slide

  87. % SELECT hstore('a', 'b');
    hstore
    ----------
    "a"=>"b"
    %
    HSTORE Functions
    SELECT hstore(ROW(1, 2));
    hstore
    ----------------------
    "f1"=>"1", "f2"=>"2"
    %
    Convert row
    to HSTORE.

    View full-size slide

  88. % SELECT hstore('a', 'b');
    hstore
    ----------
    "a"=>"b"
    %
    HSTORE Functions
    SELECT hstore(ROW(1, 2));
    hstore
    ----------------------
    "f1"=>"1", "f2"=>"2"
    % SELECT hstore(ARRAY['a','1','b','2']);
    hstore
    --------------------
    "a"=>"1", "b"=>"2"
    %
    Convert array
    to HSTORE.

    View full-size slide

  89. HSTORE Functions

    View full-size slide

  90. % SELECT akeys('a => 1, b => 2');
    akeys
    -------
    {a,b}
    %
    HSTORE Functions

    View full-size slide

  91. % SELECT akeys('a => 1, b => 2');
    akeys
    -------
    {a,b}
    %
    HSTORE Functions
    SELECT avals('a => 1, b => 2');
    avals
    -------
    {1,2}
    %

    View full-size slide

  92. % SELECT akeys('a => 1, b => 2');
    akeys
    -------
    {a,b}
    %
    HSTORE Functions
    SELECT avals('a => 1, b => 2');
    avals
    -------
    {1,2}
    % SELECT hstore_to_json('"a key" => 1, b => t, c => null');
    hstore_to_json
    -------------------------------------
    {"b": "t", "c": null, "a key": "1"}
    %

    View full-size slide

  93. HSTORE Sets
    % SELECT skeys('"a key" => 1, b => t');
    skeys
    -------
    b
    a key
    %

    View full-size slide

  94. HSTORE Sets
    % SELECT skeys('"a key" => 1, b => t');
    skeys
    -------
    b
    a key
    % SELECT svals('"a key" => 1, b => t');
    svals
    -------
    t
    1
    %

    View full-size slide

  95. HSTORE Sets
    % SELECT skeys('"a key" => 1, b => t');
    skeys
    -------
    b
    a key
    % SELECT svals('"a key" => 1, b => t');
    svals
    -------
    t
    1
    % SELECT * FROM each('"a key" => 1, b => t, c => null');
    key | value
    -------+--------
    b | t
    c |
    a key | 1

    View full-size slide

  96. HSTORE Performance

    View full-size slide

  97. HSTORE Performance
    Grabbed 1998 Amazon review data

    View full-size slide

  98. HSTORE Performance
    Grabbed 1998 Amazon review data
    Thanks CitusDB!

    View full-size slide

  99. HSTORE Performance
    Grabbed 1998 Amazon review data
    Thanks CitusDB!
    Converted from nested JSON

    View full-size slide

  100. HSTORE Performance
    Grabbed 1998 Amazon review data
    Thanks CitusDB!
    Converted from nested JSON
    To flattened HSTORE

    View full-size slide

  101. HSTORE Load
    > createdb hreviews
    > psql -d hreviews -c '
    CREATE EXTENSION HSTORE;
    CREATE TABLE reviews(review hstore);
    '
    CREATE TABLE
    >

    View full-size slide

  102. HSTORE Load
    > createdb hreviews
    > psql -d hreviews -c '
    CREATE EXTENSION HSTORE;
    CREATE TABLE reviews(review hstore);
    '
    CREATE TABLE
    >
    > time psql hreviews -c "COPY reviews FROM 'reviews.hstore'"
    COPY 589859
    0.00s user 0.00s system 0% cpu 8.595 total

    View full-size slide

  103. HSTORE Load
    68,628 records/second
    > createdb hreviews
    > psql -d hreviews -c '
    CREATE EXTENSION HSTORE;
    CREATE TABLE reviews(review hstore);
    '
    CREATE TABLE
    >
    > time psql hreviews -c "COPY reviews FROM 'reviews.hstore'"
    COPY 589859
    0.00s user 0.00s system 0% cpu 8.595 total

    View full-size slide

  104. HSTORE Load
    68,628 records/second
    233 MB COPY file
    > createdb hreviews
    > psql -d hreviews -c '
    CREATE EXTENSION HSTORE;
    CREATE TABLE reviews(review hstore);
    '
    CREATE TABLE
    >
    > time psql hreviews -c "COPY reviews FROM 'reviews.hstore'"
    COPY 589859
    0.00s user 0.00s system 0% cpu 8.595 total

    View full-size slide

  105. HSTORE Load
    68,628 records/second
    233 MB COPY file
    256 MB Database
    > createdb hreviews
    > psql -d hreviews -c '
    CREATE EXTENSION HSTORE;
    CREATE TABLE reviews(review hstore);
    '
    CREATE TABLE
    >
    > time psql hreviews -c "COPY reviews FROM 'reviews.hstore'"
    COPY 589859
    0.00s user 0.00s system 0% cpu 8.595 total

    View full-size slide

  106. HSTORE Load
    68,628 records/second
    233 MB COPY file
    256 MB Database
    9.9% storage overhead
    > createdb hreviews
    > psql -d hreviews -c '
    CREATE EXTENSION HSTORE;
    CREATE TABLE reviews(review hstore);
    '
    CREATE TABLE
    >
    > time psql hreviews -c "COPY reviews FROM 'reviews.hstore'"
    COPY 589859
    0.00s user 0.00s system 0% cpu 8.595 total

    View full-size slide

  107. Bucket ’O Books

    View full-size slide

  108. Bucket ’O Books
    % SELECT width_bucket(length(review->'product_title'), 1, 50, 5) title_bkt,
    round(avg((review->'review_rating')::numeric), 2) review_avg,
    COUNT(*)
    FROM reviews
    WHERE review->'product_group' = 'DVD'
    GROUP BY title_bkt
    ORDER BY title_bkt;

    View full-size slide

  109. Bucket ’O Books
    % SELECT width_bucket(length(review->'product_title'), 1, 50, 5) title_bkt,
    round(avg((review->'review_rating')::numeric), 2) review_avg,
    COUNT(*)
    FROM reviews
    WHERE review->'product_group' = 'DVD'
    GROUP BY title_bkt
    ORDER BY title_bkt;
    title_bkt | review_avg | count
    -----------+------------+-------
    1 | 4.27 | 2646
    2 | 4.44 | 4180
    3 | 4.53 | 1996
    4 | 4.38 | 2294
    5 | 4.48 | 943
    6 | 4.42 | 738
    (6 rows)
    Time: 207.665 ms

    View full-size slide

  110. Bucket ’O Books
    % SELECT width_bucket(length(review->'product_title'), 1, 50, 5) title_bkt,
    round(avg((review->'review_rating')::numeric), 2) review_avg,
    COUNT(*)
    FROM reviews
    WHERE review->'product_group' = 'DVD'
    GROUP BY title_bkt
    ORDER BY title_bkt;
    title_bkt | review_avg | count
    -----------+------------+-------
    1 | 4.27 | 2646
    2 | 4.44 | 4180
    3 | 4.53 | 1996
    4 | 4.38 | 2294
    5 | 4.48 | 943
    6 | 4.42 | 738
    (6 rows)
    Time: 207.665 ms
    Could be
    better.

    View full-size slide

  111. HSTORE GIN Indexing

    View full-size slide

  112. HSTORE GIN Indexing
    % CREATE INDEX idx_reviews_gin ON reviews USING gin(review);
    CREATE INDEX
    Time: 227250.133 ms
    %
    Get coffee.

    View full-size slide

  113. HSTORE GIN Indexing
    % CREATE INDEX idx_reviews_gin ON reviews USING gin(review);
    CREATE INDEX
    Time: 227250.133 ms
    %
    % SELECT width_bucket(length(review->'product_title'), 1, 50, 5) title_bkt,
    round(avg((review->'review_rating')::numeric), 2) review_avg,
    COUNT(*)
    FROM reviews
    WHERE review @> '"product_group" => "DVD"'
    GROUP BY title_bkt
    ORDER BY title_bkt; Containment.

    View full-size slide

  114. HSTORE GIN Indexing
    % CREATE INDEX idx_reviews_gin ON reviews USING gin(review);
    CREATE INDEX
    Time: 227250.133 ms
    %
    % SELECT width_bucket(length(review->'product_title'), 1, 50, 5) title_bkt,
    round(avg((review->'review_rating')::numeric), 2) review_avg,
    COUNT(*)
    FROM reviews
    WHERE review @> '"product_group" => "DVD"'
    GROUP BY title_bkt
    ORDER BY title_bkt;
    title_bkt | review_avg | count
    -----------+------------+-------
    1 | 4.27 | 2646
    2 | 4.44 | 4180
    3 | 4.53 | 1996
    4 | 4.38 | 2294
    5 | 4.48 | 943
    6 | 4.42 | 738
    (6 rows)
    Time: 28.509 ms
    Better.

    View full-size slide

  115. HSTORE Index Sizing

    View full-size slide

  116. HSTORE Index Sizing
    Database now 301 MB

    View full-size slide

  117. HSTORE Index Sizing
    Database now 301 MB
    17.6% Overhead for GIN index

    View full-size slide

  118. HSTORE Index Sizing
    Database now 301 MB
    17.6% Overhead for GIN index
    Use expression index for scalar values

    View full-size slide

  119. HSTORE Index Sizing
    Database now 301 MB
    17.6% Overhead for GIN index
    Use expression index for scalar values
    % DROP INDEX idx_reviews_gin;
    DROP INDEX;
    % CREATE INDEX idx_dvd_reviews
    ON reviews ((review -> 'product_group'));
    CREATE INDEX
    Time: 8081.842 ms No coffee.

    View full-size slide

  120. HSTORE Index Sizing

    View full-size slide

  121. HSTORE Index Sizing
    Database now 268 MB

    View full-size slide

  122. HSTORE Index Sizing
    Database now 268 MB
    5% Overhead for expression index

    View full-size slide

  123. HSTORE Index Sizing
    Database now 268 MB
    5% Overhead for expression index
    And performance?

    View full-size slide

  124. Back to the Books

    View full-size slide

  125. Back to the Books
    % SELECT width_bucket(length(review->'product_title'), 1, 50, 5) title_bkt,
    round(avg((review->'review_rating')::numeric), 2) review_avg,
    COUNT(*)
    FROM reviews
    WHERE review->'product_group' = 'DVD'
    GROUP BY title_bkt
    ORDER BY title_bkt;
    title_bkt | review_avg | count
    -----------+------------+-------
    1 | 4.27 | 2646
    2 | 4.44 | 4180
    3 | 4.53 | 1996
    4 | 4.38 | 2294
    5 | 4.48 | 943
    6 | 4.42 | 738
    (6 rows)
    Time: 20.547 ms
    Fetch scalar
    again.

    View full-size slide

  126. Back to the Books
    % SELECT width_bucket(length(review->'product_title'), 1, 50, 5) title_bkt,
    round(avg((review->'review_rating')::numeric), 2) review_avg,
    COUNT(*)
    FROM reviews
    WHERE review->'product_group' = 'DVD'
    GROUP BY title_bkt
    ORDER BY title_bkt;
    title_bkt | review_avg | count
    -----------+------------+-------
    1 | 4.27 | 2646
    2 | 4.44 | 4180
    3 | 4.53 | 1996
    4 | 4.38 | 2294
    5 | 4.48 | 943
    6 | 4.42 | 738
    (6 rows)
    Time: 20.547 ms Very nice.

    View full-size slide

  127. Dumping HSTORE

    View full-size slide

  128. Dumping HSTORE
    Binary representation

    View full-size slide

  129. Dumping HSTORE
    Binary representation
    Must be parsed and formatted

    View full-size slide

  130. Dumping HSTORE
    Binary representation
    Must be parsed and formatted
    Quite fast:

    View full-size slide

  131. Dumping HSTORE
    Binary representation
    Must be parsed and formatted
    Quite fast:
    % time pg_dump hreviews > /dev/null
    0.30s user 0.23s system 37% cpu 1.379 total

    View full-size slide

  132. When to use HSTORE

    View full-size slide

  133. When to use HSTORE
    Fast key/value store

    View full-size slide

  134. When to use HSTORE
    Fast key/value store
    Binary
    representation

    View full-size slide

  135. When to use HSTORE
    Fast key/value store
    Binary
    representation
    Slower I/O

    View full-size slide

  136. When to use HSTORE
    Fast key/value store
    Binary
    representation
    Slower I/O
    Faster operations

    View full-size slide

  137. When to use HSTORE
    Fast key/value store
    Binary
    representation
    Slower I/O
    Faster operations
    GIN index support

    View full-size slide

  138. When to use HSTORE
    Fast key/value store
    Binary
    representation
    Slower I/O
    Faster operations
    GIN index support
    Limited utility

    View full-size slide

  139. When to use HSTORE
    Fast key/value store
    Binary
    representation
    Slower I/O
    Faster operations
    GIN index support
    Limited utility
    No nesting

    View full-size slide

  140. When to use HSTORE
    Fast key/value store
    Binary
    representation
    Slower I/O
    Faster operations
    GIN index support
    Limited utility
    No nesting
    Strings only

    View full-size slide

  141. When to use HSTORE
    Fast key/value store
    Binary
    representation
    Slower I/O
    Faster operations
    GIN index support
    Limited utility
    No nesting
    Strings only
    Custom format

    View full-size slide

  142. When to use HSTORE
    Fast key/value store
    Binary
    representation
    Slower I/O
    Faster operations
    GIN index support
    Limited utility
    No nesting
    Strings only
    Custom format
    Requires parsing

    View full-size slide

  143. —Douglas Crockford
    “I discovered JSON. I do not claim
    to have invented JSON, because it
    already existed in nature.”

    View full-size slide

  144. JSON
    Added in 9.2

    View full-size slide

  145. JSON
    Added in 9.2
    Simple validation on
    input

    View full-size slide

  146. JSON
    Added in 9.2
    Simple validation on
    input
    Stored as text (like XML)

    View full-size slide

  147. JSON
    Added in 9.2
    Simple validation on
    input
    Stored as text (like XML)
    Uses server encoding

    View full-size slide

  148. JSON
    Added in 9.2
    Simple validation on
    input
    Stored as text (like XML)
    Uses server encoding
    Preserves key order and
    duplicates

    View full-size slide

  149. JSON
    Added in 9.2
    Simple validation on
    input
    Stored as text (like XML)
    Uses server encoding
    Preserves key order and
    duplicates
    Operators & Functions
    added in 9.3

    View full-size slide

  150. JSON
    Added in 9.2
    Simple validation on
    input
    Stored as text (like XML)
    Uses server encoding
    Preserves key order and
    duplicates
    Operators & Functions
    added in 9.3
    Building functions in 9.4

    View full-size slide

  151. JSON Operators

    View full-size slide

  152. JSON Operators
    % SELECT '[{"a":"foo"},{"b":"bar"},{"c":"baz"}]'::json->2;
    ?column?
    -------------
    {"c":"baz"}
    %
    Array
    lookup.

    View full-size slide

  153. JSON Operators
    % SELECT '[{"a":"foo"},{"b":"bar"},{"c":"baz"}]'::json->2;
    ?column?
    -------------
    {"c":"baz"}
    % SELECT '{"a": {"b":"foo"}}'::json->'a';
    ?column?
    -------------
    {"b":"foo"}
    %
    Key lookup.

    View full-size slide

  154. JSON Operators
    % SELECT '[{"a":"foo"},{"b":"bar"},{"c":"baz"}]'::json->2;
    ?column?
    -------------
    {"c":"baz"}
    % SELECT '{"a": {"b":"foo"}}'::json->'a';
    ?column?
    -------------
    {"b":"foo"}
    % SELECT '{"a":1,"b":2}'::json->>'b';
    ?column?
    ----------
    2
    %
    Returns text.

    View full-size slide

  155. JSON Path Operators

    View full-size slide

  156. JSON Path Operators
    % SELECT '[{"a": {"b":{"c": "foo"}}}'::json#>'{a,b}';
    ?column?
    -------------
    {"c": "foo"}
    %
    Path lookup.

    View full-size slide

  157. JSON Path Operators
    % SELECT '[{"a": {"b":{"c": "foo"}}}'::json#>'{a,b}';
    ?column?
    -------------
    {"c": "foo"}
    % SELECT '{"a":[1,2,3],"b":[4,5,6]}'::json#>>'{a,2}';
    ?column?
    ----------
    3
    %
    Returns text.

    View full-size slide

  158. JSON Constructors

    View full-size slide

  159. JSON Constructors
    % SELECT to_json('Tom says "Hi"'::text);
    to_json
    -------------------
    "Tom says \"Hi\""
    %
    Scalars okay.

    View full-size slide

  160. JSON Constructors
    % SELECT to_json('Tom says "Hi"'::text);
    to_json
    -------------------
    "Tom says \"Hi\""
    % SELECT to_json(ROW(1,2));
    to_json
    -----------------
    {"f1":1,"f2":2}
    %
    Composites
    objectified.

    View full-size slide

  161. JSON Constructors
    % SELECT to_json('Tom says "Hi"'::text);
    to_json
    -------------------
    "Tom says \"Hi\""
    % SELECT to_json(ROW(1,2));
    to_json
    -----------------
    {"f1":1,"f2":2}
    % SELECT to_json(ROW(1,true,NULL,'foo'));
    to_json
    -----------------------------------------
    {"f1":1,"f2":true,"f3":null,"f4":"foo"}
    %
    Data types
    respected.

    View full-size slide

  162. JSON Constructors

    View full-size slide

  163. JSON Constructors
    % SELECT json_build_array(1,2,'three');
    json_build_array
    ------------------
    [1, 2, "three"]
    %
    Heterogeneity
    okay.

    View full-size slide

  164. JSON Constructors
    % SELECT json_build_array(1,2,'three');
    json_build_array
    ------------------
    [1, 2, "three"]
    % SELECT json_build_object('foo',1,'bar',true);
    json_build_object
    ---------------------------
    {"foo" : 1, "bar" : true}
    %

    View full-size slide

  165. JSON Constructors
    % SELECT json_build_array(1,2,'three');
    json_build_array
    ------------------
    [1, 2, "three"]
    % SELECT json_build_object('foo',1,'bar',true);
    json_build_object
    ---------------------------
    {"foo" : 1, "bar" : true}
    % SELECT json_build_object(
    'foo', 1,
    'bar', json_build_array(1,2,'three')
    );
    json_build_object
    --------------------------------------
    {"foo" : 1, "bar" : [1, 2, "three"]}
    %
    Nesting!

    View full-size slide

  166. JSON Sets
    % SELECT * FROM json_each('{"a":"foo", "b":"bar"}');
    key | value
    -----+-------
    a | "foo"
    b | "bar"
    %
    JSON Values

    View full-size slide

  167. JSON Sets
    % SELECT * FROM json_each('{"a":"foo", "b":"bar"}');
    key | value
    -----+-------
    a | "foo"
    b | "bar"
    %
    % SELECT * FROM json_each_text(‘{"a":"foo", "b":"bar"}');
    key | value
    -----+-------
    a | foo
    b | bar
    %
    Text values.

    View full-size slide

  168. Other JSON Functions

    View full-size slide

  169. Other JSON Functions
    json_array_length()

    View full-size slide

  170. Other JSON Functions
    json_array_length()
    json_object_keys()

    View full-size slide

  171. Other JSON Functions
    json_array_length()
    json_object_keys()
    json_array_elements()

    View full-size slide

  172. Other JSON Functions
    json_array_length()
    json_object_keys()
    json_array_elements()
    json_array_elements_text()

    View full-size slide

  173. Other JSON Functions
    json_array_length()
    json_object_keys()
    json_array_elements()
    json_array_elements_text()
    json_typeof()
    json_to_record()

    View full-size slide

  174. Other JSON Functions
    json_array_length()
    json_object_keys()
    json_array_elements()
    json_array_elements_text()
    json_typeof()
    json_to_record()
    And more!

    View full-size slide

  175. JSON Performance

    View full-size slide

  176. JSON Performance
    > createdb jreviews
    > psql -d jreviews -c 'CREATE TABLE reviews(review json);'
    CREATE TABLE
    >

    View full-size slide

  177. JSON Performance
    > createdb jreviews
    > psql -d jreviews -c 'CREATE TABLE reviews(review json);'
    CREATE TABLE
    > time psql jreviews -c "COPY reviews FROM 'reviews.json'"
    COPY 589859
    0.00s user 0.00s system 0% cpu 6.767 total

    View full-size slide

  178. JSON Performance
    86,413 records/second
    > createdb jreviews
    > psql -d jreviews -c 'CREATE TABLE reviews(review json);'
    CREATE TABLE
    > time psql jreviews -c "COPY reviews FROM 'reviews.json'"
    COPY 589859
    0.00s user 0.00s system 0% cpu 6.767 total

    View full-size slide

  179. JSON Performance
    86,413 records/second
    208 MB COPY file
    > createdb jreviews
    > psql -d jreviews -c 'CREATE TABLE reviews(review json);'
    CREATE TABLE
    > time psql jreviews -c "COPY reviews FROM 'reviews.json'"
    COPY 589859
    0.00s user 0.00s system 0% cpu 6.767 total

    View full-size slide

  180. JSON Performance
    86,413 records/second
    208 MB COPY file
    240 MB Database
    > createdb jreviews
    > psql -d jreviews -c 'CREATE TABLE reviews(review json);'
    CREATE TABLE
    > time psql jreviews -c "COPY reviews FROM 'reviews.json'"
    COPY 589859
    0.00s user 0.00s system 0% cpu 6.767 total

    View full-size slide

  181. JSON Performance
    86,413 records/second
    208 MB COPY file
    240 MB Database
    15% storage overhead
    > createdb jreviews
    > psql -d jreviews -c 'CREATE TABLE reviews(review json);'
    CREATE TABLE
    > time psql jreviews -c "COPY reviews FROM 'reviews.json'"
    COPY 589859
    0.00s user 0.00s system 0% cpu 6.767 total

    View full-size slide

  182. JSON Performance
    86,413 records/second
    208 MB COPY file
    240 MB Database
    15% storage overhead
    Faster than HSTORE
    > createdb jreviews
    > psql -d jreviews -c 'CREATE TABLE reviews(review json);'
    CREATE TABLE
    > time psql jreviews -c "COPY reviews FROM 'reviews.json'"
    COPY 589859
    0.00s user 0.00s system 0% cpu 6.767 total

    View full-size slide

  183. JSON Performance
    86,413 records/second
    208 MB COPY file
    240 MB Database
    15% storage overhead
    Faster than HSTORE
    Slightly more overhead
    > createdb jreviews
    > psql -d jreviews -c 'CREATE TABLE reviews(review json);'
    CREATE TABLE
    > time psql jreviews -c "COPY reviews FROM 'reviews.json'"
    COPY 589859
    0.00s user 0.00s system 0% cpu 6.767 total

    View full-size slide

  184. Bucket ’O Books 2

    View full-size slide

  185. Bucket ’O Books 2
    % SELECT width_bucket(length(review#>>'{product,title}'), 1, 50, 5) title_bkt,
    round(avg((review#>>'{review,rating}')::numeric), 2) review_avg,
    COUNT(*)
    FROM reviews
    WHERE review#>>'{product,group}' = 'DVD'
    GROUP BY title_bkt
    ORDER BY title_bkt;
    Path lookup.

    View full-size slide

  186. Bucket ’O Books 2
    % SELECT width_bucket(length(review#>>'{product,title}'), 1, 50, 5) title_bkt,
    round(avg((review#>>'{review,rating}')::numeric), 2) review_avg,
    COUNT(*)
    FROM reviews
    WHERE review#>>'{product,group}' = 'DVD'
    GROUP BY title_bkt
    ORDER BY title_bkt;
    title_bkt | review_avg | count
    -----------+------------+-------
    1 | 4.27 | 2646
    2 | 4.44 | 4180
    3 | 4.53 | 1996
    4 | 4.38 | 2294
    5 | 4.48 | 943
    6 | 4.42 | 738
    (6 rows)
    Time: 1765.824 ms Yow!

    View full-size slide

  187. JSON Indexing

    View full-size slide

  188. JSON Indexing
    Operations slower than HSTORE

    View full-size slide

  189. JSON Indexing
    Operations slower than HSTORE
    Text parsed per operation

    View full-size slide

  190. JSON Indexing
    Operations slower than HSTORE
    Text parsed per operation
    No GIN or GiST

    View full-size slide

  191. JSON Indexing
    Operations slower than HSTORE
    Text parsed per operation
    No GIN or GiST
    Can use expression index

    View full-size slide

  192. JSON Indexing
    Operations slower than HSTORE
    Text parsed per operation
    No GIN or GiST
    Can use expression index
    % CREATE INDEX idx_dvd_reviews
    ON reviews ((review#>>'{product,group}'));
    CREATE INDEX
    Time: 10222.241 ms

    View full-size slide

  193. Back to the Books 3

    View full-size slide

  194. Back to the Books 3
    % SELECT width_bucket(length(review#>>'{product,title}'), 1, 50, 5) title_bkt,
    round(avg((review#>>'{review,rating}')::numeric), 2) review_avg,
    COUNT(*)
    FROM reviews
    WHERE review#>>'{product,group}' = 'DVD'
    GROUP BY title_bkt
    ORDER BY title_bkt;
    title_bkt | review_avg | count
    -----------+------------+-------
    1 | 4.27 | 2646
    2 | 4.44 | 4180
    3 | 4.53 | 1996
    4 | 4.38 | 2294
    5 | 4.48 | 943
    6 | 4.42 | 738
    (6 rows)
    Time: 91.863 ms Pretty good.

    View full-size slide

  195. Dumping JSON

    View full-size slide

  196. Dumping JSON
    Stored as text

    View full-size slide

  197. Dumping JSON
    Stored as text
    No parsing on output

    View full-size slide

  198. Dumping JSON
    Stored as text
    No parsing on output
    Dumping 50-60% faster than HSTORE

    View full-size slide

  199. Dumping JSON
    Stored as text
    No parsing on output
    Dumping 50-60% faster than HSTORE
    % time pg_dump jreviews > /dev/null
    0.22s user 0.17s system 65% cpu 0.59 total

    View full-size slide

  200. When to use JSON

    View full-size slide

  201. When to use JSON
    Document storage

    View full-size slide

  202. When to use JSON
    Document storage
    Duplicate preservation

    View full-size slide

  203. When to use JSON
    Document storage
    Duplicate preservation
    Key order preservation

    View full-size slide

  204. When to use JSON
    Document storage
    Duplicate preservation
    Key order preservation
    Good storage

    View full-size slide

  205. When to use JSON
    Document storage
    Duplicate preservation
    Key order preservation
    Good storage
    Stored as text

    View full-size slide

  206. When to use JSON
    Document storage
    Duplicate preservation
    Key order preservation
    Good storage
    Stored as text
    Fast I/O

    View full-size slide

  207. When to use JSON
    Document storage
    Duplicate preservation
    Key order preservation
    Good storage
    Stored as text
    Fast I/O
    Operations are Awesome

    View full-size slide

  208. When to use JSON
    Document storage
    Duplicate preservation
    Key order preservation
    Good storage
    Stored as text
    Fast I/O
    Operations are Awesome
    Fetch keys, paths

    View full-size slide

  209. When to use JSON
    Document storage
    Duplicate preservation
    Key order preservation
    Good storage
    Stored as text
    Fast I/O
    Operations are Awesome
    Fetch keys, paths
    Best on indexed
    scalars

    View full-size slide

  210. When to use JSON
    Document storage
    Duplicate preservation
    Key order preservation
    Good storage
    Stored as text
    Fast I/O
    Operations are Awesome
    Fetch keys, paths
    Best on indexed
    scalars
    When table scans okay

    View full-size slide

  211. When to use JSON
    Document storage
    Duplicate preservation
    Key order preservation
    Good storage
    Stored as text
    Fast I/O
    Operations are Awesome
    Fetch keys, paths
    Best on indexed
    scalars
    When table scans okay
    Encumbers per-row
    parsing overhead

    View full-size slide

  212. When to use JSON
    Document storage
    Duplicate preservation
    Key order preservation
    Good storage
    Stored as text
    Fast I/O
    Operations are Awesome
    Fetch keys, paths
    Best on indexed
    scalars
    When table scans okay
    Encumbers per-row
    parsing overhead
    Sound
    familiar?

    View full-size slide

  213. — Mike MacCana via HN
    “Does this [JSONB] mean I can do
    Mongo-style queries, retrieving a
    set of documents which match
    particular key: value criteria,
    using PostgreSQL?”

    View full-size slide

  214. JSONB
    New in 9.4

    View full-size slide

  215. JSONB
    New in 9.4
    Full JSON
    implementation

    View full-size slide

  216. JSONB
    New in 9.4
    Full JSON
    implementation
    Uses server encoding

    View full-size slide

  217. JSONB
    New in 9.4
    Full JSON
    implementation
    Uses server encoding
    Inspired by HSTORE 2

    View full-size slide

  218. JSONB
    New in 9.4
    Full JSON
    implementation
    Uses server encoding
    Inspired by HSTORE 2
    Binary storage

    View full-size slide

  219. JSONB
    New in 9.4
    Full JSON
    implementation
    Uses server encoding
    Inspired by HSTORE 2
    Binary storage
    HSTORE-style query
    operators

    View full-size slide

  220. JSONB
    New in 9.4
    Full JSON
    implementation
    Uses server encoding
    Inspired by HSTORE 2
    Binary storage
    HSTORE-style query
    operators
    No key order or
    duplicate preservation

    View full-size slide

  221. JSONB
    New in 9.4
    Full JSON
    implementation
    Uses server encoding
    Inspired by HSTORE 2
    Binary storage
    HSTORE-style query
    operators
    No key order or
    duplicate preservation
    Fast access operations

    View full-size slide

  222. JSONB
    New in 9.4
    Full JSON
    implementation
    Uses server encoding
    Inspired by HSTORE 2
    Binary storage
    HSTORE-style query
    operators
    No key order or
    duplicate preservation
    Fast access operations
    GIN indexing

    View full-size slide

  223. JSONB Operators

    View full-size slide

  224. JSONB Operators
    All the JSON operators, plus…

    View full-size slide

  225. JSONB Operators
    All the JSON operators, plus…
    % SELECT '{"a": 1, "b": 2}'::jsonb = '{"b": 2, "a": 1}'::jsonb;
    ?column?
    ----------
    t
    %
    Equality!

    View full-size slide

  226. JSONB Operators
    All the JSON operators, plus…
    % SELECT '{"a": 1, "b": 2}'::jsonb = '{"b": 2, "a": 1}'::jsonb;
    ?column?
    ----------
    t
    % SELECT '{"a":1, "b":2}'::jsonb @> '{"b":2}'::jsonb;
    ?column?
    ----------
    t
    %
    Containment!

    View full-size slide

  227. JSONB Operators
    All the JSON operators, plus…
    % SELECT '{"a": 1, "b": 2}'::jsonb = '{"b": 2, "a": 1}'::jsonb;
    ?column?
    ----------
    t
    % SELECT '{"a":1, "b":2}'::jsonb @> '{"b":2}'::jsonb;
    ?column?
    ----------
    t
    % SELECT '{"a":1, "b":2}'::json ? 'b';
    ?column?
    ----------
    t
    %
    Existence!

    View full-size slide

  228. Nested JSONB Operators

    View full-size slide

  229. Nested JSONB Operators
    % SELECT '{"a": [1,2]}'::jsonb = '{"a": [1,2]}'::jsonb;
    ?column?
    ----------
    t
    %
    Nested array.

    View full-size slide

  230. Nested JSONB Operators
    % SELECT '{"a": [1,2]}'::jsonb = '{"a": [1,2]}'::jsonb;
    ?column?
    ----------
    t
    % SELECT '{"a": {"b": 2, "c": 3}}'::jsonb
    @> '{"a": {"c": 3}}'::jsonb;
    ?column?
    ----------
    t
    %
    Paths including
    values..

    View full-size slide

  231. Nested JSONB Operators
    % SELECT '{"a": [1,2]}'::jsonb = '{"a": [1,2]}'::jsonb;
    ?column?
    ----------
    t
    % SELECT '{"a": {"b": 2, "c": 3}}'::jsonb
    @> '{"a": {"c": 3}}'::jsonb;
    ?column?
    ----------
    t
    % SELECT '[1, 2, 3]'::jsonb @> '[3, 1]'::jsonb;
    ?column?
    ----------
    t
    %
    Ordering ignored.

    View full-size slide

  232. JSONB Existence

    View full-size slide

  233. JSONB Existence
    % SELECT '{"a":1, "b":2, "c":3}'::jsonb ?| ARRAY['b', 'd'];
    ?column?
    -------------
    t
    %
    Any exist?

    View full-size slide

  234. JSONB Existence
    % SELECT '{"a":1, "b":2, "c":3}'::jsonb ?| ARRAY['b', 'd'];
    ?column?
    -------------
    t
    % SELECT '["a", "b", "c"]'::jsonb ?& ARRAY['a', 'b'];
    ?column?
    -------------
    t
    %
    All exist?

    View full-size slide

  235. JSONB Performance

    View full-size slide

  236. JSONB Performance
    > createdb breviews
    > psql -d breviews -c 'CREATE TABLE reviews(review jsonb);'
    CREATE TABLE
    >

    View full-size slide

  237. JSONB Performance
    > createdb breviews
    > psql -d breviews -c 'CREATE TABLE reviews(review jsonb);'
    CREATE TABLE
    > time psql breviews -c "COPY reviews FROM 'reviews.json'"
    COPY 589859
    0.00s user 0.00s system 0% cpu 9.841 total

    View full-size slide

  238. JSONB Performance
    59,939 records/second
    > createdb breviews
    > psql -d breviews -c 'CREATE TABLE reviews(review jsonb);'
    CREATE TABLE
    > time psql breviews -c "COPY reviews FROM 'reviews.json'"
    COPY 589859
    0.00s user 0.00s system 0% cpu 9.841 total

    View full-size slide

  239. JSONB Performance
    59,939 records/second
    208 MB COPY file
    > createdb breviews
    > psql -d breviews -c 'CREATE TABLE reviews(review jsonb);'
    CREATE TABLE
    > time psql breviews -c "COPY reviews FROM 'reviews.json'"
    COPY 589859
    0.00s user 0.00s system 0% cpu 9.841 total

    View full-size slide

  240. JSONB Performance
    59,939 records/second
    208 MB COPY file
    277 MB Database
    > createdb breviews
    > psql -d breviews -c 'CREATE TABLE reviews(review jsonb);'
    CREATE TABLE
    > time psql breviews -c "COPY reviews FROM 'reviews.json'"
    COPY 589859
    0.00s user 0.00s system 0% cpu 9.841 total

    View full-size slide

  241. JSONB Performance
    59,939 records/second
    208 MB COPY file
    277 MB Database
    32.9% storage overhead
    > createdb breviews
    > psql -d breviews -c 'CREATE TABLE reviews(review jsonb);'
    CREATE TABLE
    > time psql breviews -c "COPY reviews FROM 'reviews.json'"
    COPY 589859
    0.00s user 0.00s system 0% cpu 9.841 total

    View full-size slide

  242. JSONB Performance
    59,939 records/second
    208 MB COPY file
    277 MB Database
    32.9% storage overhead
    Slower than JSON (86,413 r/s)
    > createdb breviews
    > psql -d breviews -c 'CREATE TABLE reviews(review jsonb);'
    CREATE TABLE
    > time psql breviews -c "COPY reviews FROM 'reviews.json'"
    COPY 589859
    0.00s user 0.00s system 0% cpu 9.841 total

    View full-size slide

  243. JSONB Performance
    59,939 records/second
    208 MB COPY file
    277 MB Database
    32.9% storage overhead
    Slower than JSON (86,413 r/s)
    Bigger than JSON (15% overhead)
    > createdb breviews
    > psql -d breviews -c 'CREATE TABLE reviews(review jsonb);'
    CREATE TABLE
    > time psql breviews -c "COPY reviews FROM 'reviews.json'"
    COPY 589859
    0.00s user 0.00s system 0% cpu 9.841 total

    View full-size slide

  244. Bucket ’O Books Redux

    View full-size slide

  245. Bucket ’O Books Redux
    % SELECT width_bucket(length(review#>>'{product,title}'), 1, 50, 5) title_bkt,
    round(avg((review#>>'{review,rating}')::numeric), 2) review_avg,
    COUNT(*)
    FROM reviews
    WHERE review#>>'{product,group}' = 'DVD'
    GROUP BY title_bkt
    ORDER BY title_bkt;
    title_bkt | review_avg | count
    -----------+------------+-------
    1 | 4.27 | 2646
    2 | 4.44 | 4180
    3 | 4.53 | 1996
    4 | 4.38 | 2294
    5 | 4.48 | 943
    6 | 4.42 | 738
    (6 rows)
    Time: 381.158 ms
    JSON: 1765.824
    HSTORE: 207.665

    View full-size slide

  246. JSONB GIN Indexing

    View full-size slide

  247. JSONB GIN Indexing
    Supports GIN Index

    View full-size slide

  248. JSONB GIN Indexing
    Supports GIN Index
    Supports @>, ?, ?& and ?| operators



    View full-size slide

  249. JSONB GIN Indexing
    Supports GIN Index
    Supports @>, ?, ?& and ?| operators



    % CREATE INDEX idx_reviews_gin ON reviews USING gin(review);
    CREATE INDEX
    Time: 20296.090 ms
    %
    10x faster
    than HSTORE.

    View full-size slide

  250. JSONB GIN Indexing
    Supports GIN Index
    Supports @>, ?, ?& and ?| operators



    DB Size: 341 MB
    % CREATE INDEX idx_reviews_gin ON reviews USING gin(review);
    CREATE INDEX
    Time: 20296.090 ms
    %

    View full-size slide

  251. JSONB GIN Indexing
    Supports GIN Index
    Supports @>, ?, ?& and ?| operators



    DB Size: 341 MB
    23.14% overhead
    % CREATE INDEX idx_reviews_gin ON reviews USING gin(review);
    CREATE INDEX
    Time: 20296.090 ms
    %

    View full-size slide

  252. JSONB GIN Indexing
    Supports GIN Index
    Supports @>, ?, ?& and ?| operators



    DB Size: 341 MB
    23.14% overhead
    Was 17.6% for HSTORE
    % CREATE INDEX idx_reviews_gin ON reviews USING gin(review);
    CREATE INDEX
    Time: 20296.090 ms
    %

    View full-size slide

  253. JSONB GIN Performance

    View full-size slide

  254. JSONB GIN Performance
    % SELECT width_bucket(length(review#>>'{product,title}'), 1, 50, 5) title_bkt,
    round(avg((review#>>'{review,rating}')::numeric), 2) review_avg,
    COUNT(*)
    FROM reviews
    WHERE review @> '{"product": {"group": "DVD"}}'
    GROUP BY title_bkt
    ORDER BY title_bkt;
    Containment.

    View full-size slide

  255. JSONB GIN Performance
    % SELECT width_bucket(length(review#>>'{product,title}'), 1, 50, 5) title_bkt,
    round(avg((review#>>'{review,rating}')::numeric), 2) review_avg,
    COUNT(*)
    FROM reviews
    WHERE review @> '{"product": {"group": "DVD"}}'
    GROUP BY title_bkt
    ORDER BY title_bkt;
    title_bkt | review_avg | count
    -----------+------------+-------
    1 | 4.27 | 2646
    2 | 4.44 | 4180
    3 | 4.53 | 1996
    4 | 4.38 | 2294
    5 | 4.48 | 943
    6 | 4.42 | 738
    (6 rows)
    Time: 35.633 ms Pretty good.

    View full-size slide

  256. JSONB GIN json_path_ops

    View full-size slide

  257. JSONB GIN json_path_ops
    jsonb_path_ops for @> more efficient

    View full-size slide

  258. JSONB GIN json_path_ops
    jsonb_path_ops for @> more efficient
    One index entry per path




    View full-size slide

  259. JSONB GIN json_path_ops
    jsonb_path_ops for @> more efficient
    One index entry per path




    % DROP INDEX idx_reviews_gin;
    DROP INDEX
    CREATE INDEX idx_reviews_gin
    ON reviews USING gin(review jsonb_path_ops);
    CREATE INDEX
    Time: 9086.793 ms
    %
    2x faster.

    View full-size slide

  260. JSONB GIN json_path_ops
    jsonb_path_ops for @> more efficient
    One index entry per path




    DB Size: 323 MB
    % DROP INDEX idx_reviews_gin;
    DROP INDEX
    CREATE INDEX idx_reviews_gin
    ON reviews USING gin(review jsonb_path_ops);
    CREATE INDEX
    Time: 9086.793 ms
    %

    View full-size slide

  261. JSONB GIN json_path_ops
    jsonb_path_ops for @> more efficient
    One index entry per path




    DB Size: 323 MB
    16.6% overhead
    % DROP INDEX idx_reviews_gin;
    DROP INDEX
    CREATE INDEX idx_reviews_gin
    ON reviews USING gin(review jsonb_path_ops);
    CREATE INDEX
    Time: 9086.793 ms
    %

    View full-size slide

  262. json_path_ops Performance

    View full-size slide

  263. json_path_ops Performance
    % SELECT width_bucket(length(review#>>'{product,title}'), 1, 50, 5) title_bkt,
    round(avg((review#>>'{review,rating}')::numeric), 2) review_avg,
    COUNT(*)
    FROM reviews
    WHERE review @> '{"product": {"group": "DVD"}}'
    GROUP BY title_bkt
    ORDER BY title_bkt;
    title_bkt | review_avg | count
    -----------+------------+-------
    1 | 4.27 | 2646
    2 | 4.44 | 4180
    3 | 4.53 | 1996
    4 | 4.38 | 2294
    5 | 4.48 | 943
    6 | 4.42 | 738
    (6 rows)
    Time: 29.817 ms About the same.

    View full-size slide

  264. Dumping JSONB

    View full-size slide

  265. Dumping JSONB
    Binary representation

    View full-size slide

  266. Dumping JSONB
    Binary representation
    Must be parsed on output

    View full-size slide

  267. Dumping JSONB
    Binary representation
    Must be parsed on output
    Dumping 41% slower than HSTORE

    View full-size slide

  268. Dumping JSONB
    Binary representation
    Must be parsed on output
    Dumping 41% slower than HSTORE
    % time pg_dump breviews > /dev/null
    0.19s user 0.16s system 17% cpu 1.95 total

    View full-size slide

  269. When to use JSONB

    View full-size slide

  270. When to use JSONB
    When on 9.4

    View full-size slide

  271. When to use JSONB
    When on 9.4
    No duplicate preservation

    View full-size slide

  272. When to use JSONB
    When on 9.4
    No duplicate preservation
    No key order preservation

    View full-size slide

  273. When to use JSONB
    When on 9.4
    No duplicate preservation
    No key order preservation
    Great object storage

    View full-size slide

  274. When to use JSONB
    When on 9.4
    No duplicate preservation
    No key order preservation
    Great object storage
    Binary representation

    View full-size slide

  275. When to use JSONB
    When on 9.4
    No duplicate preservation
    No key order preservation
    Great object storage
    Binary representation
    Efficient operators

    View full-size slide

  276. When to use JSONB
    When on 9.4
    No duplicate preservation
    No key order preservation
    Great object storage
    Binary representation
    Efficient operators
    Operations are Awesome

    View full-size slide

  277. When to use JSONB
    When on 9.4
    No duplicate preservation
    No key order preservation
    Great object storage
    Binary representation
    Efficient operators
    Operations are Awesome
    Fetch keys, paths

    View full-size slide

  278. When to use JSONB
    When on 9.4
    No duplicate preservation
    No key order preservation
    Great object storage
    Binary representation
    Efficient operators
    Operations are Awesome
    Fetch keys, paths
    GIN index-aware

    View full-size slide

  279. When to use JSONB
    When on 9.4
    No duplicate preservation
    No key order preservation
    Great object storage
    Binary representation
    Efficient operators
    Operations are Awesome
    Fetch keys, paths
    GIN index-aware
    Expression indexes
    with GIN

    View full-size slide

  280. When to use JSONB
    When on 9.4
    No duplicate preservation
    No key order preservation
    Great object storage
    Binary representation
    Efficient operators
    Operations are Awesome
    Fetch keys, paths
    GIN index-aware
    Expression indexes
    with GIN
    Slower I/O

    View full-size slide

  281. Review
    XML
    Only when necessary

    View full-size slide

  282. Review
    XML
    Only when necessary
    HSTORE

    View full-size slide

  283. Review
    XML
    Only when necessary
    HSTORE
    Flat okay

    View full-size slide

  284. Review
    XML
    Only when necessary
    HSTORE
    Flat okay
    Strings okay

    View full-size slide

  285. Review
    XML
    Only when necessary
    HSTORE
    Flat okay
    Strings okay
    Fast operations

    View full-size slide

  286. Review
    XML
    Only when necessary
    HSTORE
    Flat okay
    Strings okay
    Fast operations
    JSON

    View full-size slide

  287. Review
    XML
    Only when necessary
    HSTORE
    Flat okay
    Strings okay
    Fast operations
    JSON
    Document storage

    View full-size slide

  288. Review
    XML
    Only when necessary
    HSTORE
    Flat okay
    Strings okay
    Fast operations
    JSON
    Document storage
    Key preservation

    View full-size slide

  289. Review
    XML
    Only when necessary
    HSTORE
    Flat okay
    Strings okay
    Fast operations
    JSON
    Document storage
    Key preservation
    JSONB

    View full-size slide

  290. Review
    XML
    Only when necessary
    HSTORE
    Flat okay
    Strings okay
    Fast operations
    JSON
    Document storage
    Key preservation
    JSONB
    Everything else

    View full-size slide

  291. —Theory
    “Use JSONB unless you have a
    very good reason not to.”

    View full-size slide

  292. XML, HSTORE, JSON,
    JSONB—OH MY!
    David E. Wheeler
    @theory
    theory.so
    !
    PGCon
    2014-05-23

    View full-size slide