Upgrade to Pro — share decks privately, control downloads, hide ads and more …

Agile Database Development with JSON

D7b6e701f0155fc189bbca6c89223b3c?s=47 Chris
June 18, 2020

Agile Database Development with JSON

An overview of JSON functionality in Oracle Database

D7b6e701f0155fc189bbca6c89223b3c?s=128

Chris

June 18, 2020
Tweet

Transcript

  1. 1 Agile Database Development with JSON Chris Saxon Developer Advocate,

    @ChrisRSaxon & @SQLDaily blogs.oracle.com/sql youtube.com/c/TheMagicofSQL asktom.oracle.com
  2. Image by Semevent from Pixabay We're creating a new online

    store Selling boxes of brick models
  3. Photo by Jon Tyson on Unsplash We need to respond

    to customer feedback…
  4. Photo by Johannes Plenio on Unsplash …and evolve the application

    rapidly
  5. Photo by Brannon Naito on Unsplash working in short sprints

    and releasing often So we need to be Agile
  6. { JSON } To support this we'll store data as

  7. 7 Agile Database Development with JSON Chris Saxon Developer Advocate,

    @ChrisRSaxon & @SQLDaily blogs.oracle.com/sql youtube.com/c/TheMagicofSQL asktom.oracle.com
  8. The following is intended to outline our general product direction.

    It is intended for information purposes only, and may not be incorporated into any contract. It is not a commitment to deliver any material, code, or functionality, and should not be relied upon in making purchasing decisions. The development, release, timing, and pricing of any features or functionality described for Oracle’s products may change and remains at the sole discretion of Oracle Corporation. Statements in this presentation relating to Oracle’s future plans, expectations, beliefs, intentions and prospects are “forward-looking statements” and are subject to material risks and uncertainties. A detailed discussion of these factors and other risks that affect our business is contained in Oracle’s Securities and Exchange Commission (SEC) filings, including our most recent reports on Form 10-K and Form 10-Q under the heading “Risk Factors.” These filings are available on the SEC’s website or on Oracle’s website at http://www.oracle.com/investor. All information in this presentation is current as of September 2019 and Oracle undertakes no duty to update any statement in light of new information or future events. Safe Harbor
  9. User Story #1 We must be able to store product

    & order details So we need to create the tables and define CRUD operations on them
  10. create table products ( product_id integer not null primary key,

    product_json ##TODO## not null, check ( product_data is json ) ); create table orders ( order_id integer not null primary key, order_json ##TODO## not null, check ( order_data is json ) ); The tables are just a primary key, JSON column, & is json constraint
  11. create table products ( product_id integer not null primary key,

    product_json ##TODO## not null, check ( product_data is json ) ); create table orders ( order_id integer not null primary key, order_json ##TODO## not null, check ( order_data is json ) ); But which data type to use for JSON?!
  12. Which data type should you use for JSON? "Small" documents

    varchar2 "Large" documents ??? <= 4,000 bytes / 32k
  13. "Small" documents varchar2 "Large" documents blob JSON data type in

    21c Avoids character set conversions Less storage than clob
  14. create table products ( product_id integer not null primary key,

    product_json blob not null, check ( product_data is json ) ); create table orders ( order_id integer not null primary key, order_json blob not null, check ( order_data is json ) );
  15. insert into products ( product_json ) values ( utl_raw.cast_to_raw (

    '{ "productName": "..." }' ) ); BLOBs need extra processing on insert
  16. select product_json from products; PRODUCT_JSON 7B202274686973223A20227468617422207D and select to make

    them human readable
  17. select json_serialize ( product_json returning clob pretty ) jdata from

    products; JDATA { "productName": "..." } Added in 19c json_serialize converts JSON data to text; which you can pretty print for readability
  18. select json_query ( product_json, '$' returning clob pretty ) jdata

    from products; JDATA { "productName": "..." } In earlier releases use json_query The clob return type was added in 18c
  19. User Story #2 Customers must be able to search by

    price So we need to query the products table for JSON where the unitPrice is in the specified range
  20. { "productName": "GEEKWAGON", "descripion": "Ut commodo in …", "unitPrice": 35.97,

    "bricks": [ { "colour": "red", "shape": "cube", "quantity": 13 }, { "colour": "green", "shape": "cube", "quantity": 17 }, … ] } We need to search for this value in the documents
  21. select * from products p where p.product_json.unitPrice <= :max_price; But

    remember it returns varchar2 => implicit conversion! Use simple dot-notation to access the value
  22. select * from products p where json_value ( product_json, '$.unitPrice'

    returning number ) <= :max_price; json_value gives you more control So this returns number => no implicit conversion! :)
  23. select * from products p where p.product_json.unitPrice.number() <= :max_price; From

    19c you can state the return type with simple dot-notation
  24. User Story #3 Customers must be able to view their

    orders Showing order details and a list of what they bought So we need to join the order productIds to products
  25. { "customerId" : 2, "orderDatetime" : "2019-01-01T03:25:43", "products" : [

    { "productId" : 1, "unitPrice" : 74.95 }, { "productId" : 10, "unitPrice" : 35.97 }, … ] } We need to extract these from the product array
  26. select o.order_json.products[*].productId from orders o; PRODUCTS [2,8,5] [3,9,6] [1,10,7,4] ...

    With simple dot-notation, you can get an array of the values…
  27. select json_query ( order_json, '$.products[*].productId' with array wrapper ) from

    orders o; PRODUCTS [2,8,5] [3,9,6] [1,10,7,4] ... But to join these to products, we need to convert them to rows… …or with json_query
  28. json_table With json_table you can convert JSON… …to relational rows-and-columns

  29. with order_items as ( select order_id, t.* from orders o,

    json_table ( order_json columns ( customerId, nested products[*] columns ( productId, unitPrice ) ) ) t ) Simplified syntax 18c
  30. with order_items as ( select order_id, t.* from orders o,

    json_table ( order_json columns ( customerId, nested products[*] columns ( productId, unitPrice ) ) ) t ) This tells the database to return a row for each element in the products array…
  31. select order_id, p.product_json.productName product, unitPrice from order_items oi join products

    p on oi.productId = p.product_id where customerId = :cust_var order by oi.order_id desc, p.product_id …So you can join output to the products table!
  32. Minimum viable product complete! Ship it!

  33. Copyright © 2019 Oracle and/or its affiliates. Soooo… How many

    orders today? …people have lots of questions As always, post release… Ryan McGuire / Gratisography
  34. User Story #4 Sales must be able to view today's

    orders We need to create a dashboard counting orders So we need to search for orders placed today
  35. { "customerId" : 2, "orderDatetime" : "2019-01-01T03:25:43", "products" : [

    { "productId" : 1, "unitPrice" : 74.95 }, { "productId" : 10, "unitPrice" : 35.97 }, … ] } We need to search for this value in the documents
  36. select * from orders o where o.order_json.orderDatetime >= trunc (

    sysdate ); ORA-01861: literal does not match format string Remember the implicit conversions? It fails for dates! Use simple dot-notation to access the value
  37. select * from orders o where json_value ( order_json, '$.orderDatetime'

    returning date ) >= trunc ( sysdate ) So you need to define the return type; JSON dates conform to ISO 8601 date
  38. 2019-01-01 ISO 8601 date Which is YYYY-MM-DD for dates There

    is no time component in an ISO date!
  39. 2019-01-01T03:25:43 ISO 8601 timestamp Use ISO timestamps to include times

    Note the "T" between the date and time!
  40. select * from orders o where json_value ( order_json, '$.orderDatetime'

    returning date ) >= trunc ( sysdate ) But the query is very slow…
  41. select * from orders o where json_value ( order_json, '$.orderDatetime'

    returning date ) >= trunc ( sysdate ) { "customerId": 1, … } { "customerId": 2, … } …
  42. User Story #4b … and make it fast! currently the

    query does a full table scan To speed it up we need to create an index!
  43. create index orders_date_i on orders ( order_json ); ORA-02327: cannot

    create index on expression with datatype LOB You can't index LOB data
  44. create search index orders_json_i on orders ( order_json ) for

    json parameters ( 'sync (on commit)' ); Added in 12.2, a json search index enables JSON queries to use an index JSON Search Indexes
  45. select * from orders o where json_value ( order_json, '$.orderDatetime'

    returning date ) >= trunc ( sysdate ) { "customerId": 1, … } { "customerId": 2, … } …
  46. ----------------------------------------------------- | Id | Operation | Name | ----------------------------------------------------- |

    0 | SELECT STATEMENT | | |* 1 | TABLE ACCESS BY INDEX ROWID| ORDERS | |* 2 | DOMAIN INDEX | ORDERS_JSON_I | ----------------------------------------------------- With the search index in place, the optimizer can use it
  47. Predicate Information (identified by operation id): --------------------------------------------------- 1 - filter(JSON_VALUE("ORDER_JSON"

    FORMAT JSON , '$.orderDatetime' RETURNING TIMESTAMP NULL ON ERROR) >= TIMESTAMP' 2019-01-15 00:00:00') 2 - access("CTXSYS"."CONTAINS"("O"."ORDER_JSON", 'sdatap(TMS_orderDatetime >= "2019-01-15T00:00:00+00:00" /orderDatetime)')>0) Under the covers, this uses Oracle Text
  48. create index order_date_i on orders ( json_value ( order_json, '$.orderDatetime'

    returning date error on error null on empty ) ); It's more efficient to create a function- based index, matching the search you'll do This has some other benefits…
  49. create index order_date_i on orders ( json_value ( order_json, '$.orderDatetime'

    returning date error on error null on empty ) ); Data validation! If the value is not a JSON date; inserts will raise an exception
  50. create index order_date_i on orders ( json_value ( order_json, '$.orderDatetime'

    returning date error on error null on empty ) ); From 12.2 you can also raise an error if the attribute is not present
  51. ------------------------------------------------------------ | Id | Operation | Name | ------------------------------------------------------------ |

    0 | SELECT STATEMENT | | | 1 | TABLE ACCESS BY INDEX ROWID BATCHED| ORDERS | |* 2 | INDEX RANGE SCAN | ORDER_DATE_I | ------------------------------------------------------------ The function-based index is more efficient, so the optimizer will choose this over the search index
  52. Search vs. Function-Based Indexes JSON Search Index Function-based Index Applicability

    Any JSON query Matching function Performance Slower Faster Use Ad-hoc queries Application queries
  53. 0 5 10 15 20 25 With the dashboard in

    place, it's clear sales are levelling off We need a way to increase sales!
  54. We need to offer discounts! …discount promotion codes Marketing have

    a brilliant plan… Ryan McGuire / Gratisography
  55. User Story #5 Customers may be able to enter a

    promotion code This will give a discount We need to store the code and discount value
  56. { …, "promotion": { "code": "20OFF", "discountAmount": 20 } }

    The order JSON will include the an promotion object… …so there are no changes needed in the database!
  57. Nothing to do in the database! relax! So you can

    sit back and count the money! Ryan McGuire / Gratisography
  58. 0 20 40 60 80 100 120 Customers love the

    promotion Sales are going through the roof!
  59. Cake for everyone! The promotion is a success! Ryan McGuire

    / Gratisography
  60. Where's the $$$?! …the $$$ tells a different story But

    finance are unhappy… Ryan McGuire / Gratisography
  61. -250 -200 -150 -100 -50 0 50 100 150 Red

    bars = sales Yellow line = profits The discount is too big! We're losing money!
  62. Finance need to view order profitability They need to understand

    where we're losing money
  63. User Story #6 Store unit cost for each brick We

    need to update the product JSON; adding unitCost to every object in the bricks arrays
  64. { …, "bricks": [ { "colour": "red", "shape": "cube", "quantity":

    13 }, { "colour": "green", "shape": "cuboid", "quantity": 17 }, … ] } Add unitCost
  65. "Luckily" we have the costs in a spreadsheet!

  66. "bricks": [ { "colour": "red", "shape": "cube", "quantity": 13 },

    { "colour": "green", "shape": "cuboid", "quantity": 17 }, … ] join on colour, shape We need to combine the spreadsheet data with the stored JSON
  67. Step 1: transform JSON to rows-and-columns Step 3: convert back

    to JSON Step 2: join the costs
  68. Photo by Gus Ruballo on Unsplash Buckle up! This will

    be a bumpy ride!
  69. select * from external ( ( colour varchar2(30), shape varchar2(30),

    unit_cost number ) default directory tmp location ( 'costs.csv' ) ) From 18c you can query files "on the fly" with an inline external table
  70. select product_id, j.* from products, json_table ( product_json columns (

    nested bricks[*] columns ( pos for ordinality, colour path '$.colour', shape path '$.shape', brick format json path '$' ) ) ) j Using JSON_table to extract the bricks as rows
  71. select product_id, j.* from products, json_table ( product_json columns (

    nested bricks[*] columns ( pos for ordinality, colour path '$.colour', shape path '$.shape', brick format json path '$' ) ) ) j
  72. select product_id, j.* from products, json_table ( product_json columns (

    nested bricks[*] columns ( pos for ordinality, colour path '$.colour', shape path '$.shape', brick format json path '$' ) ) ) j
  73. with costs as ( select * from external … ),

    bricks as ( select product_id, j.* from products, json_table ( … ) ) select … from bricks join costs on … We've joined the data, but how do we convert it back to JSON?
  74. json_object json_objectagg json_array json_arrayagg (12.2) JSON Generation Functions

  75. select json_object ( 'colour' value b.colour, 'shape' value b.shape, 'quantity'

    value b.quantity, 'unitCost' value c.cost ) from bricks b join costs c on b.colour = c.colour and b.shape = c.shape; So you can create a brick object with json_object…
  76. select json_mergepatch ( brick, '{ "unitCost": ' || c.cost ||

    '}' ) from bricks b join costs c on b.colour = c.colour and b.shape = c.shape; Add/replace this… …to this document … or use json_mergepatch (19c) to add it to the brick object
  77. { "colour": "red", "shape": "cube", "quantity": 13, "unitCost": 0.59 }

    { "colour": "green", "shape": "cuboid", "quantity": 17, "unitCost": 0.39 } This returns a row for each brick To combine them into an array for each product, use json_arrayagg
  78. json_arrayagg ( json_mergepatch ( brick, '{ "unitCost": ' || cost

    || '}' ) order by pos )
  79. [ { "colour": "red", "shape": "cube", "quantity": 13, "unitCost": 0.59

    }, { "colour": "green", "shape": "cuboid", "quantity": 17, "unitCost": 0.39 }, … ] Make the array into an object with json_object
  80. json_object ( 'bricks' value json_arrayagg ( json_mergepatch ( brick, '{

    "unitCost": ' || cost || '}' ) order by pos ) )
  81. "bricks": [ { "colour": "red", "shape": "cube", "quantity": 13, "unitCost":

    0.59 }, { "colour": "green", "shape": "cuboid", "quantity": 17, "unitCost": 0.39 }, … ] And replace this array in the product JSON with json_mergepatch
  82. json_mergepatch ( product, json_object ( 'bricks' value json_arrayagg ( json_mergepatch

    ( brick, '{ "unitCost": ' || cost || '}' ) order by pos ) ) )
  83. { "productName": "GEEKWAGON", "descripion": "Ut commodo in …", "unitPrice": 35.97,

    "bricks": [ { …, "unitCost": 0.59 }, { …, "unitCost": 0.39 }, … ] } Finally! We've added unitCost to every element in the array We just need to update the table…
  84. update products set product_json = ( with costs as (

    select * from external … ), bricks as ( select … ) select json_mergepatch … )
  85. …at least we can view order profitability now That was

    hard work Ryan McGuire / Gratisography
  86. User Story #7 Create report prices - discount – total

    cost We've got the data; but want an easier way to query it…
  87. dbms_json.add_virtual_columns ( 'orders', 'order_json' ); JSON Data Guide Added in

    12.2, the JSON Data Guide enables you to expose attributes as virtual columns in the table. To do this, the column must have a json search index
  88. desc orders Name Null? Type ORDER_ID NOT NULL NUMBER(38) ORDER_JSON

    NOT NULL BLOB ORDER_JSON$customerId NUMBER ORDER_JSON$orderDatetime VARCHAR2(32) ORDER_JSON$code VARCHAR2(8) ORDER_JSON$discountAmount NUMBER Sadly it only exposes scalar (non-array) values
  89. dbms_json.create_view_on_path ( 'product_bricks_vw', 'products', 'product_json', '$' ); …using json_table on

    this! Create this view… Luckily you can create a view instead!
  90. select product_id, "PRODUCT_JSON$shape" shape, "PRODUCT_JSON$colour" colour from product_bricks_vw order by

    product_id, shape, colour You can now query the view to see JSON as rows-and-columns
  91. PRODUCT_ID SHAPE COLOUR 1 cube green 1 cube red 1

    cylinder blue 1 cylinder blue 1 cylinder green 1 cylinder green … … … The unique key for a brick is (colour, shape) Some products have duplicate entries in the bricks array! We're shipping too many bricks!
  92. User Story #8 FIX ALL THE DATAZ! We need to

    remove all the duplicate entries from the product brick arrays
  93. { ..., "bricks" : [ { "colour" : "red", "shape"

    : "cylinder", "quantity" : 20, "unitCost" : 0.39 }, { "colour" : "red", "shape" : "cylinder", "quantity" : 20, "unitCost" : 0.39 } { ..., "bricks" : [ { "colour" : "red", "shape" : "cylinder", "quantity" : 8, "unitCost" : 0.39 }, { "colour" : "blue", "shape" : "cylinder", "quantity" : 10, "unitCost" : 0.98 } Comparing the brick arrays for two products shows unitCost is duplicated
  94. { ..., "bricks" : [ { "colour" : "red", "shape"

    : "cylinder", "quantity" : 20, "unitCost" : 0.39 }, { "colour" : "red", "shape" : "cylinder", "quantity" : 20, "unitCost" : 0.39 } { ..., "bricks" : [ { "colour" : "red", "shape" : "cylinder", "quantity" : 8, "unitCost" : 0.39 }, { "colour" : "blue", "shape" : "cylinder", "quantity" : 10, "unitCost" : 0.98 } And the brick itself is duplicated within an array
  95. Wrong Data Model PRODUCTS BRICKS The JSON models the relationship

    between products and bricks as 1:M This is the wrong data model the relationship is M:M
  96. Fixed It! PRODUCTS BRICKS PRODUCT_BRICKS unique ( product_id, brick_id )

    { JSON } { JSON } { JSON } You need a junction table between products and bricks This avoids duplication & enables constraints
  97. You still need to model { JSON } data!

  98. Copyright © 2019 Oracle and/or its affiliates. "The more I

    work with existing NoSQL deployments however, the more I believe that their schemaless nature has become an excuse for sloppiness and unwillingness to dwell on a project’s data model beforehand" - Florents Tselai https://tselai.com/modern-data-practice-and-the-sql-tradition.html
  99. select distinct "PRODUCT_JSON$shape" shape, "PRODUCT_JSON$colour" colour, "PRODUCT_JSON$unitCost" unit_cost from product_bricks_vw

    Moving from 1:M to M:M Using the JSON Data Guide view, you can find all the unique brick types…
  100. with vals as ( select distinct "PRODUCT_JSON$shape" shape, "PRODUCT_JSON$colour" colour,

    "PRODUCT_JSON$unitCost" unit_cost from product_bricks_vw ) select rownum brick_id, v.* from vals v; …assign a unique ID to each ( colour, shape ) …
  101. create table bricks as with vals as ( select distinct

    "PRODUCT_JSON$shape" shape, "PRODUCT_JSON$colour" colour, "PRODUCT_JSON$unitCost" unit_cost from product_bricks_vw ) select rownum brick_id, v.* from vals v; …and create a table from the results!
  102. create table bricks as with vals as ( select distinct

    "PRODUCT_JSON$shape" "shape", "PRODUCT_JSON$colour" "colour", "PRODUCT_JSON$unitCost" "unitCost" from product_bricks_vw ) select rownum brick_id, json_object ( v.* ) brick_json from vals v; 19c simplification (Storing the values as JSON if you want)
  103. create table product_bricks as select distinct product_id, brick_id from product_bricks_vw

    join bricks on ... Create the Join Table
  104. json_mergepatch ( product_json, '{ "bricks": null }' ) If you

    pass a null value for an attribute to JSON_mergepatch, it's removed from the source Removing the bricks array from products
  105. When should I store { JSON }?

  106. Storing JSON can be the right choice for… JSON responses

    - 3rd party APIs - IoT devices Schema extensions - flex fields - sparse columns 1 2
  107. Further Reading How to Store, Query, and Create JSON Documents

    in Oracle Database Blog Post http://bit.ly/json-in-oracle-db Presentation Live SQL Scripts http://bit.ly/agile-json-livesql Copyright © 2019 Oracle and/or its affiliates.
  108. Copyright © 2019 Oracle and/or its affiliates. VS Some people

    suggest JSON and relational are fundamentally different
  109. Copyright © 2019 Oracle and/or its affiliates. VS This is

    not the case! However you store data, you still need to normalize it to avoid duplication and errors
  110. Copyright © 2019 Oracle and/or its affiliates. How you store

    the data is a spectrum from just rows-and-columns to wholly JSON and everything in-between
  111. Copyright © 2019 Oracle and/or its affiliates. Oracle Database supports

    it all! However you store your data
  112. asktom.oracle.com #MakeDataGreatAgain Ryan McGuire / Gratisography