Upgrade to Pro — share decks privately, control downloads, hide ads and more …

Handling XML and JSON in the Database

Handling XML and JSON in the Database

These days you can't go far without encountering XML or JSON and in the world of the web these data types are ubiquitous. Since version 8.3 XML has been supported as a data type and JSON support was introduced in 9.2. We'll be looking at what advantages there are in storing your data with these data types and how we can query and manipulate our data once it's stored.

Mike Fowler

July 12, 2013
Tweet

More Decks by Mike Fowler

Other Decks in Technology

Transcript

  1. Handling XML and JSON in the Database Mike Fowler, [email protected]

    PGDayUK 2013 PGDay UK 2013 Handling XML and JSON in the Database Mike Fowler, [email protected] 12th July 2013
  2. Handling XML and JSON in the Database Mike Fowler, [email protected]

    PGDayUK 2013 Overview • XML – XML Primer – History of XML in PostgreSQL – Using PostgreSQL's XML features • JSON – JSON Primer – History of JSON in PostgreSQL – Using PostgreSQL's JSON features
  3. Handling XML and JSON in the Database Mike Fowler, [email protected]

    PGDayUK 2013 About Me • Been using PostgreSQL for ~10 years • Contributed some XML support – XMLEXISTS/xpath_exists() – xml_is_well_formed() • Buildfarm member piapiac – Amazon EC2 based build for JDBC driver – Has lead to a number of bugfix patches for JDBC http://www.pgbuildfarm.org/cgi-bin/show_status.pl?member=piapiac
  4. Handling XML and JSON in the Database Mike Fowler, [email protected]

    PGDayUK 2013 Reasons to store XML/JSON • Client application uses it – Configuration – Serialised objects • Data format/schema is highly complex/variable • You just don't “care” about the data! – Audit data – Application log files
  5. Handling XML and JSON in the Database Mike Fowler, [email protected]

    PGDayUK 2013 Primer: XML • eXtensible Markup Language • Human readable data interchange & serialisation format • Consists of a root element containing a mix of child elements and text content with any element having optional attributes <root> <child attribute=”value”> text based content </child> </root>
  6. Handling XML and JSON in the Database Mike Fowler, [email protected]

    PGDayUK 2013 XML Support • ANSI and ISO standards exist – Introduced in SQL/XML 2003 – Augmented in SQL/XML 2006 • Prior to 8.3 XML support was a contrib module • Added to core in 8.3 but remains a compile time option enabled with: configure --with-libxml
  7. Handling XML and JSON in the Database Mike Fowler, [email protected]

    PGDayUK 2013 XML Support • xml datatype (internally stored as text) • Keywords from the standards – DOCUMENT, CONTENT – XMLPARSE, XMLSERIALIZE ... • Predicates, also from the standards – IS [NOT] DOCUMENT – XMLEXISTS (9.1) • A number of support functions, not standard – xmlconcat(), xpath() ...
  8. Handling XML and JSON in the Database Mike Fowler, [email protected]

    PGDayUK 2013 XML: The Hard Way • Using “standard” SQL, XML is inserted: INSERT INTO demo (data) VALUES (XMLPARSE (DOCUMENT '<?xml version="1.0"?> <book> <title>Manual</title> <chapter>...</chapter> </book>')) • To retrieve using “standard” SQL: SELECT XMLSERIALIZE(DOCUMENT data AS text) FROM ...
  9. Handling XML and JSON in the Database Mike Fowler, [email protected]

    PGDayUK 2013 XML: The Easy Way • It's a normal datatype, use normal casting! • INSERT (::xml is optional as text will implicitly cast to xml even in 9.3) pgday=# INSERT INTO demo (data) VALUES ('<?xml version="1.0"?> pgday-# <book> <title>Manual</title> pgday-# <chapter>...</chapter> </book>'::xml); INSERT 0 1
  10. Handling XML and JSON in the Database Mike Fowler, [email protected]

    PGDayUK 2013 XML: The Easy Way • SELECT (::text is optional as far as rendering in the psql client is concerned) pgday=# SELECT data::text FROM demo WHERE ... data ---------------------------------------------------------- <book><title>Manual</title><chapter>...</chapter></book> (1 row)
  11. Handling XML and JSON in the Database Mike Fowler, [email protected]

    PGDayUK 2013 XML: xmloption • When casting without XMLPARSE or XMLSERIALIZE the choice of DOCUMENT or CONTENT is determined by the value of the 'XML option' session variable SET XML OPTION { DOCUMENT | CONTENT} SET xmloption TO { DOCUMENT | CONTENT}
  12. Handling XML and JSON in the Database Mike Fowler, [email protected]

    PGDayUK 2013 XML: Predicates • IS DOCUMENT / IS NOT DOCUMENT – Use to filter between DOCUMENT and CONTENT – Only works with data that is already cast as XML pgday=# SELECT XMLSERIALIZE(DOCUMENT data AS text) FROM pgday-# WHERE data IS DOCUMENT; pgday=# SELECT XMLSERIALIZE(CONTENT data AS text) FROM pgday-# WHERE data IS NOT DOCUMENT;
  13. Handling XML and JSON in the Database Mike Fowler, [email protected]

    PGDayUK 2013 XML: Predicates • xml_is_well_formed() – Introduced in 9.1 and takes text as a parameter – Sensitive to XMLOPTION • SET xmloption DOCUMENT; – Also xml_is_well_formed_documet() and xml_is_well_formed_content()
  14. Handling XML and JSON in the Database Mike Fowler, [email protected]

    PGDayUK 2013 XML: xpath() xpath(xpath, xml [, nsarray]) • Allows you to extract elements and text • Supports namespaces • Returns an array of XML • Also returns empty arrays when there is no match SELECT xpath('//title/text()',data) FROM ...
  15. Handling XML and JSON in the Database Mike Fowler, [email protected]

    PGDayUK 2013 XML: XMLEXISTS XMLEXISTS(text PASSING [BY REF] xml [BY REF]) • From the standard, useful for predicates • First parameter is an xpath expression SELECT xpath('//title/text()',data) FROM demo WHERE XMLEXISTS('//title' PASSING data);
  16. Handling XML and JSON in the Database Mike Fowler, [email protected]

    PGDayUK 2013 XML: xpath_exists() xpath_exists(xpath, xml [, nsarray]) • Serves the same purpose as XMLEXISTS but: – Supports namespaces – Syntax is simplier SELECT xpath('//title/text()',data) FROM demo WHERE xpath_exists('//title', data);
  17. Handling XML and JSON in the Database Mike Fowler, [email protected]

    PGDayUK 2013 Primer: JSON • JavaScript Object Notation • Also a human readable data interchange & serialisation format • { } denote objects containing a comma separated list of name value pairs where values can be nested objects or arrays { “name”:”value”, “nestedObject”: { “array”:[“value1”,”value2”] }, “numeric”:13 }
  18. Handling XML and JSON in the Database Mike Fowler, [email protected]

    PGDayUK 2013 JSON Support • Not a SQL standard • PostgreSQL is currently the only RDBMS with native support • Design borrows heavily from PostgreSQL's existing XML support – Internal storage format is TEXT – Basic validation (e.g. well formed) • Does not require special libraries to enable
  19. Handling XML and JSON in the Database Mike Fowler, [email protected]

    PGDayUK 2013 JSON Support • Basic support introduced in 9.2 – JSON datatype – 2 support functions • row_to_json() • array_to_json() • Operators and additional functions in 9.3 – ->,->>,#>,#>> – 10 more support functions
  20. Handling XML and JSON in the Database Mike Fowler, [email protected]

    PGDayUK 2013 JSON Support • INSERT always ensures type validity pgday=# INSERT INTO demo VALUES pgday-# ('{"name":"Mike","hungry":true}'); INSERT 0 1 pgday=# INSERT INTO demo VALUES pgday-# ('{"name":"Mike",hungry:true}'); ERROR: invalid input syntax for type json at character 28
  21. Handling XML and JSON in the Database Mike Fowler, [email protected]

    PGDayUK 2013 JSON: Operators • Retrieve the value of an attribute SELECT data->'name' AS name FROM … • Use the value of an attribute in a predicate SELECT data FROM demo WHERE data->>'name' = 'Mike';
  22. Handling XML and JSON in the Database Mike Fowler, [email protected]

    PGDayUK 2013 JSON: Functions row_to_json(record [, pretty_bool]) Returns a result set where each row is a JSON object pgday=# SELECT * FROM demo; username | posts | email ----------+-------+------------------- mlfowler | 121 | [email protected] fowlerm | 9 | [email protected] pgday=# SELECT row_to_json(demo) FROM demo; row_to_json ----------------------------------------------------------------- {"username":"mlfowler","posts":121,"email":"[email protected]"} {"username":"fowlerm","posts":9,"email":"[email protected]"}
  23. Handling XML and JSON in the Database Mike Fowler, [email protected]

    PGDayUK 2013 JSON: Functions json_object_keys(json) Returns all the keys in the top level pgday=# SELECT json_object_keys(data) FROM demo; json_object_keys ------------------ name hungry
  24. Handling XML and JSON in the Database Mike Fowler, [email protected]

    PGDayUK 2013 Caveats • They are not indexable – You will need to plan how best to retrieve • Best used programmaticly – Syntax of XML and JSON can be unwieldy • Non-UTF8 encoded databases • Not portable
  25. Handling XML and JSON in the Database Mike Fowler, [email protected]

    PGDayUK 2013 Summary • Using the PostgreSQL XML and JSON datatypes allows you finer control of an otherwise free format field • PostgreSQL's XML non-standard support is much easier to use than the standard • JSON is still being very actively developed