JSON Schema - Core Concepts, Common Pitfalls, and Debugging

Ben Hutton
October 16, 2019

Access the interactive deck at https://stoic-agnesi-d0ac4a.netlify.com

Fund more of these talks: https://opencollective.com/json-schema
Tip this talk: https://ko-fi.com/relequestual

JSON Schema appears simple on the surface, and it can be used to create very simple validation, but sometimes you want to validate more complex structures.
Looking after the official JSON Schema Slack server for the past few years has highlighted a number of common problems and pitfalls.

You'll be taken on a journey of discovery to see how reframing your understanding of JSON Schema documents can help you avoid those pitfalls, and how to understand core concepts such as applicability.

Transcript

  1. JSON Schema Core Concepts, Common Pitfalls, and Debugging

    Ben Hutton, @relequestual on the internet. JSON Schema core. opencollective.com/json-schema ☕ ko-fi.com/relequestual
    ASC 2019 - API Specification Conference 2019
    These slides have interactive code which does not display fully in PDF. Please see https://stoic-agnesi-d0ac4a.netlify.com for the full version. All code used and created for this deck is available on GitHub.
  2. Matchmaker Exchange API

    Siloed databases of patient data (including DNA variants)
    Rare and undiagnosed cases (> 10 per country rare)
    Complex legal and ethical situation
    Discover and exchange patient data
  3. Matchmaker Exchange API

    Each database holds slightly different data
    Representation of a patient needs to be uniform
    Write specification using words and examples
    Release v1.0!
    Sounds easy, right?
  4. Requirements for queries

    Root object must have a patient
    patient must have the property genomicFeatures, the property phenotypicFeatures, or both
  5. { "$schema": "http://json-schema.org/draft-07/schema", "title": "MatchMakerExchange format for queries", "patient": { "$comment": "`patient`: `genomicFeatures` or `phenotypicFeatures` or both", "anyOf": [ "genomicFeatures", "phenotypicFeatures" ] } }

    `patient` is an unknown keyword here. Unknown keywords are ignored.
  6. JSON Schema

    A vocabulary that allows you to annotate and validate JSON documents.
    An IETF "personal draft" document specification
    JSON!
    Must be a boolean or an object
  7. JSON Schema

    This talk covers draft-7. Not draft-4, not draft-8.
    draft-8 ("draft 2019-09") is out, but implementations need to catch up.
    draft-7 is the latest well-supported version.
  8. Key Concepts

    Schema "keywords": Object properties that are applied to the instance
    Instance: The JSON document being validated or associated with a Schema
    Root Schema: A Schema that is the whole JSON document
    Subschema: A Schema as a value of an object or array
    Keywords fall under several behavior categories, which include:
    Assertions: produce a boolean result when applied to an instance
    Annotations: attach information to an instance for application use
  9. "properties" keyword

    The value of properties is an object.
    The values of that object are subschemas, each applied to the value of the matching key in the instance.
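The difference between the first attempt on slide 5 and a schema that uses `properties` can be sketched with a tiny hand-rolled evaluator. This is an illustration only, not a conformant validator: the names `validate`, `KEYWORDS`, `check_type` and `check_properties` are hypothetical, and only `type` and `properties` are implemented. Any keyword without a handler, including the unknown `patient`, contributes no assertion.

```python
# Hypothetical mini-evaluator: it understands only "type" and "properties".
TYPE_CHECKS = {
    "object": lambda v: isinstance(v, dict),
    "array": lambda v: isinstance(v, list),
    "string": lambda v: isinstance(v, str),
}


def check_type(value, instance):
    names = [value] if isinstance(value, str) else value
    return any(TYPE_CHECKS[name](instance) for name in names)


def check_properties(value, instance):
    if not isinstance(instance, dict):
        return True  # "properties" says nothing about non-objects
    # Each subschema is applied only to the value of a matching key.
    return all(validate(sub, instance[key])
               for key, sub in value.items() if key in instance)


KEYWORDS = {"type": check_type, "properties": check_properties}


def validate(schema, instance):
    if isinstance(schema, bool):
        return schema
    # Keywords with no handler here are skipped, so they can never fail.
    return all(KEYWORDS[key](value, instance)
               for key, value in schema.items() if key in KEYWORDS)


slide_5_schema = {
    "$schema": "http://json-schema.org/draft-07/schema",
    "title": "MatchMakerExchange format for queries",
    "patient": {"anyOf": ["genomicFeatures", "phenotypicFeatures"]},
}
accepts_number = validate(slide_5_schema, 12345)
accepts_empty_patient = validate(slide_5_schema, {"patient": {}})

with_properties = {"properties": {"patient": {"type": "object"}}}
rejects_wrong_type = validate(with_properties, {"patient": "not an object"})
accepts_missing_key = validate(with_properties, {})
```

Under these assumptions the slide 5 schema accepts every instance, while the `properties` version constrains `patient` only when that key is present.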
  10. { "$schema": "http://json-schema.org/draft-07/schema", "title": "MatchMakerExchange format for queries", "properties": { "patient": { "$comment": "`patient`: `genomicFeatures` or `phenotypicFeatures` or both", "anyOf": [ { "properties": { "phenotypicFeatures": true } }, { "properties": { "genomicFeatures": true } } ] } } }

    Remember, the values of a `properties` object must be Schemas. A Boolean is a valid Schema.
  11. An Object or a Boolean

    Always passes validation:
    {} (an empty object)
    true (a `true` boolean value)
    Always fails validation:
    { "not": {} } (the `not` keyword inverts the assertion)
    false (a `false` boolean value)
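The four trivial schemas above can be checked with a few lines of Python. This is a minimal sketch, assuming a hypothetical `validate` function that knows only boolean schemas and the `not` keyword; it is not a real validator.

```python
def validate(schema, instance):
    """Hypothetical evaluator covering only boolean schemas and `not`."""
    if isinstance(schema, bool):
        return schema  # true always passes, false always fails
    if "not" in schema:
        return not validate(schema["not"], instance)
    return True  # {} carries no assertions, so nothing can fail


# A few arbitrary instances; the result should not depend on them.
instances = [None, 42, "text", [], {"patient": {}}]

always_pass = [all(validate(s, i) for i in instances) for s in ({}, True)]
always_fail = [any(validate(s, i) for i in instances) for s in ({"not": {}}, False)]
```

Here `always_pass` holds `True` for both `{}` and `true`, and `always_fail` holds `False` for both `{ "not": {} }` and `false`.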
  12. Does it work?

    { "$schema": "http://json-schema.org/draft-07/schema", "title": "MatchMakerExchange format for queries", "properties": { "patient": { "anyOf": [ { "properties": { "phenotypicFeatures": { "type": [ "array" ] } } }, { "properties": { "genomicFeatures": { "type": [ "array" ] } } } ] } } }
    { "patient": { "phenotypicFeatures": "long nose" } }
    But didn't I define that `phenotypicFeatures` should be an array?
  13. (More) Key Concepts

    Applicator keywords: Determine how subschemas are applied to an instance location. These include, but are not limited to, oneOf, allOf, and anyOf.
  14. *Of applicators

    The value of each *Of keyword must be an array of Schemas.
    The result of applying a schema to an instance includes an assertion of validity.
    The resulting assertions are modified or combined to produce a final result.
    For example, oneOf requires that exactly one of the Schemas in the array is valid.
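How the per-subschema assertions are combined can be sketched as a small function. The name `combine` is hypothetical, and it takes the boolean results of the subschemas as already computed; it only illustrates the combining step.

```python
def combine(keyword, branch_results):
    """Combine the validity of each subschema for one *Of keyword."""
    results = list(branch_results)
    if keyword == "allOf":
        return all(results)       # every subschema must be valid
    if keyword == "anyOf":
        return any(results)       # at least one subschema must be valid
    if keyword == "oneOf":
        return sum(results) == 1  # exactly one subschema must be valid
    raise ValueError(f"not an *Of keyword: {keyword}")


# An instance that satisfies both subschemas, e.g. a patient with both
# genomicFeatures and phenotypicFeatures.
both_valid = [True, True]
any_of_result = combine("anyOf", both_valid)
one_of_result = combine("oneOf", both_valid)
all_of_result = combine("allOf", [True, False])
```

For the "either or both" requirement on slide 4 this is why `anyOf` fits: with both subschemas valid, `anyOf` passes while `oneOf` fails.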
  15. Why doesn't it work?

    { "$schema": "http://json-schema.org/draft-07/schema", "title": "MatchMakerExchange format for queries", "properties": { "patient": { "anyOf": [ { "properties": { "phenotypicFeatures": { "type": [ "array" ] } }, "additionalProperties": false }, { "properties": { "genomicFeatures": { "type": [ "array" ] } }, "additionalProperties": false } ] } } }
    { "patient": { "phenotypicFeatures": "long nose" } }
    ... ❌ YES! Validation fails, as expected
  16. { "$schema": "http://json-schema.org/draft-07/schema", "title": "MatchMakerExchange format for queries", "properties": { "patient": { "anyOf": [ { "required": [ "phenotypicFeatures" ], "properties": { "phenotypicFeatures": { "type": [ "array" ] } }, "additionalProperties": false }, { "required": [ "genomicFeatures" ], "properties": { "genomicFeatures": { "type": [ "array" ] } }, "additionalProperties": false } ] } } }

    { "patient": { } }
    ❌ Validation now fails as expected!
  17. What did we discover?

    Some keywords have values that are a JSON Schema (a subschema)
    *Of keywords are applicators that take an array of subschemas
    You can set the value of a subschema to a boolean false to check the validation path is as you expect
    The values of a properties object are subschemas which are applied to an object's values for matching keys
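The debugging steps on slides 12 to 16 can be replayed with a small hand-rolled evaluator. This is a sketch under stated assumptions: `validate`, `branch` and `query_schema` are hypothetical helpers, only `type`, `anyOf`, `required`, `properties` and `additionalProperties` are implemented (no `patternProperties`), and a real project would use an existing draft-07 implementation instead.

```python
PY_TYPES = {"array": list, "object": dict, "string": str}


def validate(schema, instance):
    """Hypothetical evaluator for a handful of draft-07 keywords only."""
    if isinstance(schema, bool):
        return schema
    if "type" in schema:
        names = schema["type"] if isinstance(schema["type"], list) else [schema["type"]]
        if not any(isinstance(instance, PY_TYPES[n]) for n in names):
            return False
    if "anyOf" in schema and not any(validate(s, instance) for s in schema["anyOf"]):
        return False
    if isinstance(instance, dict):  # object keywords ignore non-objects
        if any(name not in instance for name in schema.get("required", [])):
            return False
        props = schema.get("properties", {})
        for key, value in instance.items():
            # A key not listed in `properties` falls to additionalProperties,
            # which defaults to the always-valid schema.
            sub = props[key] if key in props else schema.get("additionalProperties", True)
            if not validate(sub, value):
                return False
    return True


def branch(name, required=False, closed=False):
    """Build one anyOf subschema in the style of slides 12, 15 and 16."""
    sub = {"properties": {name: {"type": ["array"]}}}
    if required:
        sub["required"] = [name]
    if closed:
        sub["additionalProperties"] = False
    return sub


def query_schema(required=False, closed=False):
    names = ("phenotypicFeatures", "genomicFeatures")
    branches = [branch(n, required, closed) for n in names]
    return {"properties": {"patient": {"anyOf": branches}}}


wrong_type = {"patient": {"phenotypicFeatures": "long nose"}}
empty_patient = {"patient": {}}
good = {"patient": {"phenotypicFeatures": ["long nose"]}}

slide_12 = validate(query_schema(), wrong_type)
slide_15 = validate(query_schema(closed=True), wrong_type)
slide_15_empty = validate(query_schema(closed=True), empty_patient)
slide_16_empty = validate(query_schema(required=True, closed=True), empty_patient)
slide_16_good = validate(query_schema(required=True, closed=True), good)
```

In this sketch the slide 12 schema accepts the string value because the `genomicFeatures` subschema has no matching key and therefore nothing to fail. Adding `additionalProperties: false` rejects it, and adding `required` also rejects the empty patient.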
  18. Spec change

    Tidy the subschemas into definitions
    Refactor the schema
    patient now needs to include some additional fields. A definition is provided.
  19. { "$schema": "http://json-schema.org/draft-07/schema", "title": "MatchMakerExchange format for queries", "definitions": { "phenotypicFeatures": { "type": [ "array" ] }, "genomicFeatures": { "type": [ "array" ] }, "geneticsPatient": { "properties": { "phenotypicFeatures": { "$ref": "#/definitions/phenotypicFeatures" }, "genomicFeatures": { "$ref": "#/definitions/genomicFeatures" } }, "additionalProperties": false, "anyOf": [ { "required": [ "phenotypicFeatures" ] }, { "required": [ "genomicFeatures" ] } ] }, "regularPatient": { "type": "object", "required": [ "name" ], "properties": { "name": { "type": [ "string" ] } } } }, "properties": { "patient": { "allOf": [ { "$ref": "#/definitions/regularPatient" }, { "$ref": "#/definitions/geneticsPatient" } ]

    Validation always fails... how do we fix this?
  20. Does it work?

    { "patient": { "phenotypicFeatures": ["long nose"], "name": "bob" } }
    Validation Error: should NOT have additional properties. additionalProperties at "#/additionalProperties". Instance location: "/patient"
    ❌ Validation always fails... how do we fix this?
  21. "required": [ "genomicFeatures" ] } ] }, "regularPatient": { "type": "object", "required": [ "name" ], "properties": { "name": { "type": [ "string" ] } } } }, "properties": { "patient": {

    ... and from `geneticsPatient`, but no others.
  22. additionalProperties

    What specifically does additionalProperties DO?
    additionalProperties takes a schema as its value. Often this is boolean false.
    The schema is only applied to the instance object's values where the keys have not already been evaluated as a result of properties or patternProperties within the same schema object.
    additionalProperties cannot "see through" applicator keywords (such as allOf and $ref references).
    Validation with "additionalProperties" applies only to the child values of instance names that do not match any names in "properties", and do not match any regular expression in "patternProperties".
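The "same schema object" rule can be made concrete with a hand-rolled evaluator run against the schema from slide 19 and the workaround of listing the allowed names next to `additionalProperties` on `patient`. This is a sketch only: `validate` and `resolve` are hypothetical, just the keywords used in the talk are handled, and only local `#/...` references are resolved.

```python
PY_TYPES = {"array": list, "object": dict, "string": str}


def resolve(ref, root):
    """Follow a local "#/a/b" reference inside the root schema."""
    node = root
    for part in ref.lstrip("#/").split("/"):
        node = node[part]
    return node


def validate(schema, instance, root):
    if isinstance(schema, bool):
        return schema
    if "$ref" in schema:  # in draft-07, siblings of $ref are ignored
        return validate(resolve(schema["$ref"], root), instance, root)
    if "type" in schema:
        names = schema["type"] if isinstance(schema["type"], list) else [schema["type"]]
        if not any(isinstance(instance, PY_TYPES[n]) for n in names):
            return False
    if not all(validate(s, instance, root) for s in schema.get("allOf", [])):
        return False
    if "anyOf" in schema and not any(validate(s, instance, root) for s in schema["anyOf"]):
        return False
    if isinstance(instance, dict):
        if any(name not in instance for name in schema.get("required", [])):
            return False
        # Only THIS schema object's `properties` is consulted; nothing
        # inside allOf or behind $ref is visible to additionalProperties.
        props = schema.get("properties", {})
        for key, value in instance.items():
            sub = props[key] if key in props else schema.get("additionalProperties", True)
            if not validate(sub, value, root):
                return False
    return True


genetics = {
    "properties": {
        "phenotypicFeatures": {"$ref": "#/definitions/phenotypicFeatures"},
        "genomicFeatures": {"$ref": "#/definitions/genomicFeatures"},
    },
    "additionalProperties": False,
    "anyOf": [{"required": ["phenotypicFeatures"]}, {"required": ["genomicFeatures"]}],
}
definitions = {
    "phenotypicFeatures": {"type": ["array"]},
    "genomicFeatures": {"type": ["array"]},
    "geneticsPatient": genetics,
    "regularPatient": {"type": "object", "required": ["name"],
                       "properties": {"name": {"type": ["string"]}}},
}
all_of = [{"$ref": "#/definitions/regularPatient"},
          {"$ref": "#/definitions/geneticsPatient"}]
slide_19 = {"definitions": definitions, "properties": {"patient": {"allOf": all_of}}}

# Workaround: drop additionalProperties from geneticsPatient and close the
# `patient` subschema itself, naming every allowed key with `true`.
open_genetics = {k: v for k, v in genetics.items() if k != "additionalProperties"}
workaround = {
    "definitions": dict(definitions, geneticsPatient=open_genetics),
    "properties": {"patient": {
        "additionalProperties": False,
        "properties": {"name": True, "phenotypicFeatures": True, "genomicFeatures": True},
        "allOf": all_of,
    }},
}

bob = {"patient": {"phenotypicFeatures": ["long nose"], "name": "bob"}}
bob_extra = {"patient": {"phenotypicFeatures": ["long nose"], "name": "bob", "age": 3}}

slide_19_result = validate(slide_19, bob, slide_19)
workaround_result = validate(workaround, bob, workaround)
workaround_extra = validate(workaround, bob_extra, workaround)
```

With the slide 19 schema, `name` counts as additional inside `geneticsPatient`, so the instance is rejected. With the workaround the same instance is accepted, and an unlisted key such as the made-up `age` is still rejected.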
  23. "type": [ "string" ] } } } }, "properties": { "patient": { "additionalProperties": false, "properties": { "name": true, "phenotypicFeatures": true, "genomicFeatures": true }, "allOf": [ { "$ref": "#/definitions/regularPatient" }, { "$ref": "#/definitions/geneticsPatient" }

    The only solution is to add `properties` alongside `additionalProperties`, but we don't want to define any constraints on their values in this subschema, so each one is set to `true`.
  24. { "$schema": "http://json-schema.org/draft-07/schema", "title": "MatchMakerExchange format for queries", "definitions": { "phenotypicFeatures": { "type": [ "array" ] }, "genomicFeatures": { "type": [ "array" ] }, "geneticsPatient": { "properties": { "phenotypicFeatures": { "$ref": "#/definitions/phenotypicFeatures" }, "genomicFeatures": { "$ref": "#/definitions/genomicFeatures" } }, "anyOf": [ { "required": [ "phenotypicFeatures" ] }, { "required": [

    { "patient": { "phenotypicFeatures": ["long nose"], "name": "bob" } }
    `phenotypicFeatures` and `genomicFeatures` are defined in `definitions` and are referenced by the individual properties of `geneticsPatient`
  25. allOf > $ref schemas and additionalProperties: false

    Being able to combine schema objects without additional properties is a VERY common pain point.
    We recognised this and fixed it in draft-8, but that's a whole other talk.
  26. What else did we discover?

    Schemas may be boolean values, even as a value of the properties object
    Constructing and merging complex schemas may require some duplication
    additionalProperties CANNOT "see through" applicator keywords like *Of or $ref
  27. JSON Schema Core Concepts, Common Pitfalls, and Debugging

    Thank you sponsors!
    Ben Hutton, @relequestual on the internet. JSON Schema core. opencollective.com/json-schema ☕ ko-fi.com/relequestual
    ASC 2019 - API Specification Conference 2019