JSON Schema - Core Concepts, Common Pitfalls, and Debugging

Ben Hutton
October 16, 2019

Access the interactive deck at https://stoic-agnesi-d0ac4a.netlify.com

Fund more of these talks: https://opencollective.com/json-schema
Tip this talk: https://ko-fi.com/relequestual

JSON Schema appears pretty simple on the surface, and it can be used to create really simple validation, but sometimes there are more complex structures you want to validate.
Looking after the official JSON Schema Slack server for the past few years has highlighted a number of common problems and pitfalls.

You'll be taken on a journey of discovery to see how reframing your understanding of JSON Schema documents can help avoid common problems and pitfalls, and how to understand core concepts such as applicability.

Transcript

  1. JSON Schema
    Core Concepts, Common Pitfalls, and Debugging
    Ben Hutton
    @relequestual on the internet
    JSON Schema core
    opencollective.com/json-schema
    ☕ ko-fi.com/relequestual
    ASC 2019 - API Specification Conference 2019
    These slides have interactive code which does not display fully in PDF.
    Please see https://stoic-agnesi-d0ac4a.netlify.com for the full version.
    All code used and created for this deck is available on GitHub.

  2. Validation saves lives

  3. Validation (probably?) saves lives

  4. Validation changes lives
    Validation (probably?) saves lives

  5. Story

  6. Matchmaker Exchange API
    Siloed databases of patient data
    (including DNA variants)
    Rare and undiagnosed cases
    (> 10 per country rare)
    Complex legal and ethical situation
    Discover and exchange patient data

  7. Matchmaker Exchange API
    Each database holds slightly different data
    Representation of a patient needs to be uniform
    Write specification using words and examples
    Release v1.0!
    Sounds easy, right?

  8. Nope

  9. Building a schema

  10. Requirements for queries
    Root object must have a patient
    patient must have either of the properties
    genomicFeatures or
    phenotypicFeatures or
    both

  11. {
      "$schema": "http://json-schema.org/draft-07/schema",
      "title": "MatchMakerExchange format for queries",
      "patient": {
        "$comment": "`patient`: `genomicFeatures` or `phenotypicFeatures` or both",
        "anyOf": [
          "genomicFeatures",
          "phenotypicFeatures"
        ]
      }
    }
    `patient` is an unknown keyword here. Unknown keywords are ignored.
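
One way to catch this class of mistake early is to list the root-level keys that are not draft-07 keywords. The sketch below is illustrative only: `unknown_keywords` is a hypothetical helper, not part of any validator library, and it inspects only the top level of the schema.

```python
# Keywords defined by the draft-07 core and validation specifications.
DRAFT7_KEYWORDS = {
    "$schema", "$id", "$ref", "$comment", "title", "description", "default",
    "examples", "readOnly", "writeOnly", "definitions",
    "type", "enum", "const", "format",
    "multipleOf", "maximum", "exclusiveMaximum", "minimum", "exclusiveMinimum",
    "maxLength", "minLength", "pattern",
    "items", "additionalItems", "maxItems", "minItems", "uniqueItems", "contains",
    "properties", "patternProperties", "additionalProperties", "propertyNames",
    "required", "dependencies", "maxProperties", "minProperties",
    "contentMediaType", "contentEncoding",
    "if", "then", "else", "allOf", "anyOf", "oneOf", "not",
}


def unknown_keywords(schema):
    """Hypothetical helper: list root-level keys that are not draft-07 keywords."""
    return sorted(key for key in schema if key not in DRAFT7_KEYWORDS)


# The first attempt at the schema, copied from the slide above.
slide_schema = {
    "$schema": "http://json-schema.org/draft-07/schema",
    "title": "MatchMakerExchange format for queries",
    "patient": {
        "$comment": "`patient`: `genomicFeatures` or `phenotypicFeatures` or both",
        "anyOf": ["genomicFeatures", "phenotypicFeatures"],
    },
}

ignored = unknown_keywords(slide_schema)
print(ignored)  # ['patient']
```

Because `patient` is ignored, nothing inside it is ever applied to the instance, so this schema accepts any JSON document.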

  12. What's a "schema" anyway?

  13. JSON Schema
    A vocabulary that allows you to annotate and validate JSON documents.
    An IETF "personal draft" document specification
    JSON!
    Must be a boolean or an object

  14. JSON Schema
    This talk covers draft-7. Not draft-4, not draft-8
    draft-8 ("draft 2019-09") is out, but implementations need to catch up
    draft-7 is the latest well-supported version

  15. Key Concepts
    Schema “keywords”: Object properties that are applied to the instance
    Instance: The JSON document being validated or associated with a Schema
    Root Schema: A Schema that is the whole JSON document
    Subschema: A Schema as a value of an object or array
    Keywords fall under several behavior categories which include:
    Assertions: produce a boolean result when applied to an instance
    Annotations: attach information to an instance for application use

  16. "properties" keyword
    properties is an object
    Each value in the object is a Schema, applied to the value of the matching key in the instance

  17. {
      "$schema": "http://json-schema.org/draft-07/schema",
      "title": "MatchMakerExchange format for queries",
      "properties": {
        "patient": {
          "$comment": "`patient`: `genomicFeatures` or `phenotypicFeatures` or both",
          "anyOf": [
            {
              "properties": {
                "phenotypicFeatures": true
              }
            },
            {
              "properties": {
                "genomicFeatures": true
              }
            }
          ]
        }
      }
    }
    Remember, the values of a `properties` object must be Schemas. A Boolean is a valid Schema.

  18. A Boolean can be a schema?

  19. What's a Schema again? (pt 2)

  20. An Object or a Boolean
    Always passes validation
    {}
    An empty object
    true
    a `true` boolean value
    Always fails validation
    { "not": {} }
    `not` keyword: inverts the assertion
    false
    a `false` boolean value
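
The four cases on this slide can be reproduced with a very small evaluator. This is a toy sketch, assuming only boolean schemas and the `not` keyword need to be understood; `is_valid` is a hypothetical function, not the API of any real validator.

```python
def is_valid(schema, instance):
    """Toy evaluator: understands boolean schemas and the `not` keyword only."""
    if isinstance(schema, bool):
        # `true` accepts every instance, `false` rejects every instance.
        return schema
    if "not" in schema:
        # `not` inverts the result of its subschema.
        return not is_valid(schema["not"], instance)
    # An object with no assertions constrains nothing.
    return True


samples = [None, 42, "text", [], {"patient": {}}]
always_pass = [True, {}]
always_fail = [False, {"not": {}}]

results_pass = [is_valid(s, i) for s in always_pass for i in samples]
results_fail = [is_valid(s, i) for s in always_fail for i in samples]

print(all(results_pass))  # True
print(any(results_fail))  # False
```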

  21. Does it work?
    {
      "$schema": "http://json-schema.org/draft-07/schema",
      "title": "MatchMakerExchange format for queries",
      "properties": {
        "patient": {
          "anyOf": [
            {
              "properties": {
                "phenotypicFeatures": {
                  "type": ["array"]
                }
              }
            },
            {
              "properties": {
                "genomicFeatures": {
                  "type": ["array"]
                }
              }
            }
          ]
        }
      }
    }
    {
      "patient": {
        "phenotypicFeatures": "long nose"
      }
    }
    But didn't I define that it should be an array?

  22. anyOf what?

  23. (More) Key Concepts
    Applicator keywords: Determine how subschemas are applied to an instance location.
    These include, but are not limited to:
    oneOf, allOf, and anyOf

  24. *Of applicators
    The value of each *Of keyword must be an array of Schemas.
    The result of applying a schema to an instance includes an assertion of validity.
    The resulting assertions are modified or combined to produce a final result.
    For example, oneOf requires that exactly one of the Schemas in the array is valid.
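
The combining step can be sketched in a few lines. The toy evaluator below is an illustration, not a real validator: it assumes only `type`, `properties`, and the three *Of keywords matter, and `is_valid` is a hypothetical name. It is applied to the `patient` value from the "Does it work?" slide to show why that instance passed.

```python
JSON_TYPES = {"array": list, "object": dict, "string": str}


def is_valid(schema, instance):
    """Toy evaluator for `type`, `properties`, and the *Of applicators only."""
    if isinstance(schema, bool):
        return schema
    if "type" in schema:
        allowed = schema["type"]
        if isinstance(allowed, str):
            allowed = [allowed]
        if not any(isinstance(instance, JSON_TYPES[name]) for name in allowed):
            return False
    if isinstance(instance, dict):
        # `properties` only constrains keys that are present in the instance.
        for key, subschema in schema.get("properties", {}).items():
            if key in instance and not is_valid(subschema, instance[key]):
                return False
    for keyword in ("allOf", "anyOf", "oneOf"):
        if keyword in schema:
            # Each subschema yields its own boolean assertion...
            results = [is_valid(sub, instance) for sub in schema[keyword]]
            # ...and the applicator combines them into one result.
            combined = {
                "allOf": all(results),
                "anyOf": any(results),
                "oneOf": results.count(True) == 1,
            }[keyword]
            if not combined:
                return False
    return True


branches = [
    {"properties": {"phenotypicFeatures": {"type": ["array"]}}},
    {"properties": {"genomicFeatures": {"type": ["array"]}}},
]
patient = {"phenotypicFeatures": "long nose"}

branch_results = [is_valid(branch, patient) for branch in branches]
print(branch_results)                           # [False, True]
print(is_valid({"anyOf": branches}, patient))   # True
```

The second branch passes because `genomicFeatures` is absent, so its `properties` keyword has nothing to check, and `anyOf` needs only one passing branch.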

  25. Why doesn't it work?
    {
      "$schema": "http://json-schema.org/draft-07/schema",
      "title": "MatchMakerExchange format for queries",
      "properties": {
        "patient": {
          "anyOf": [
            {
              "properties": {
                "phenotypicFeatures": {
                  "type": ["array"]
                }
              },
              "additionalProperties": false
            },
            {
              "properties": {
                "genomicFeatures": {
                  "type": ["array"]
                }
              },
              "additionalProperties": false
            }
          ]
        }
      }
    }
    {
      "patient": {
        "phenotypicFeatures": "long nose"
      }
    }
    ...
    ❌ YES! Validation fails, as expected

  26. We fixed it!
    Almost

  27. {
      "$schema": "http://json-schema.org/draft-07/schema",
      "title": "MatchMakerExchange format for queries",
      "properties": {
        "patient": {
          "anyOf": [
            {
              "required": ["phenotypicFeatures"],
              "properties": {
                "phenotypicFeatures": {
                  "type": ["array"]
                }
              },
              "additionalProperties": false
            },
            {
              "required": ["genomicFeatures"],
              "properties": {
                "genomicFeatures": {
                  "type": ["array"]
                }
              },
              "additionalProperties": false
            }
          ]
        }
      }
    }
    {
      "patient": {
      }
    }
    ❌ Validation now fails as expected!

  28. What did we discover?
    Some keywords have values that are a JSON Schema (a subschema)
    *Of keywords are applicators that take an array of subschemas
    You can set the value of a subschema to a boolean false to check that the
    validation path is as you expect
    The values of a properties object are subschemas which are applied to
    an object's values for matching keys
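
The `false` probe can be sketched as follows. This is a toy illustration that assumes only boolean schemas and `properties` are involved; `is_valid` is a hypothetical function and the misspelled key is an invented example of a typo.

```python
def is_valid(schema, instance):
    """Toy evaluator: boolean schemas and `properties` only."""
    if isinstance(schema, bool):
        return schema
    if isinstance(instance, dict):
        for key, subschema in schema.get("properties", {}).items():
            if key in instance and not is_valid(subschema, instance[key]):
                return False
    return True


instance = {"patient": {"phenotypicFeatures": ["long nose"]}}

# Probe 1: `false` sits where the instance value really is, so validation fails.
reached = {"properties": {"patient": {"properties": {"phenotypicFeatures": False}}}}

# Probe 2: a misspelled key (hypothetical typo) means the `false` is never applied.
not_reached = {"properties": {"patient": {"properties": {"phenotypicFeature": False}}}}

print(is_valid(reached, instance))      # False
print(is_valid(not_reached, instance))  # True
```

If the probe does not make validation fail, the subschema at that location is not being applied to the value you think it is.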

  29. Spec change!
    That doesn't happen, does it?

  30. Spec change
    Tidy the subschemas into definitions
    Refactor the schema
    patient now needs to include some additional fields. A definition is
    provided.

  31. Let's do this

  32. {
      "$schema": "http://json-schema.org/draft-07/schema",
      "title": "MatchMakerExchange format for queries",
      "definitions": {
        "phenotypicFeatures": {
          "type": ["array"]
        },
        "genomicFeatures": {
          "type": ["array"]
        },
        "geneticsPatient": {
          "properties": {
            "phenotypicFeatures": {
              "$ref": "#/definitions/phenotypicFeatures"
            },
            "genomicFeatures": {
              "$ref": "#/definitions/genomicFeatures"
            }
          },
          "additionalProperties": false,
          "anyOf": [
            {
              "required": ["phenotypicFeatures"]
            },
            {
              "required": ["genomicFeatures"]
            }
          ]
        },
        "regularPatient": {
          "type": "object",
          "required": ["name"],
          "properties": {
            "name": {
              "type": ["string"]
            }
          }
        }
      },
      "properties": {
        "patient": {
          "allOf": [
            {
              "$ref": "#/definitions/regularPatient"
            },
            {
              "$ref": "#/definitions/geneticsPatient"
            }
          ]
        }
      }
    }
    Validation always fails... how do we fix this?

  33. Validation Error:
    should NOT have additional properties.
    additionalProperties at "#/additionalProperties"
    Instance location: "/patient"
    {
    "patient": {
    "phenotypicFeatures": ["long nose"],
    "name": "bob"
    }
    }
    Does it work?
    ❌ Validation always fails... how do we fix this?

  34. "required": ["genomicFeatures"]
            }
          ]
        },
        "regularPatient": {
          "type": "object",
          "required": ["name"],
          "properties": {
            "name": {
              "type": ["string"]
            }
          }
        }
      },
      "properties": {
        "patient": {
    ... and from `geneticsPatient`, but no others.

  35. additionalProperties
    What specifically does additionalProperties DO?
    additionalProperties takes a schema as its value. Often this is boolean false.
    The schema is only applied to the instance object's values, where the keys have not already been
    evaluated as a result of properties or patternProperties within the same schema
    object.
    additionalProperties cannot "see through" applicator keywords (such as allOf and
    $ref references)
    Validation with "additionalProperties" applies only to the child
    values of instance names that do not match any names in "properties",
    and do not match any regular expression in "patternProperties".
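
The "cannot see through" behaviour can be reproduced with a toy evaluator. The sketch below is an approximation under stated assumptions: it handles only `type`, `required`, `properties`, `additionalProperties`, `allOf`, `anyOf`, and local `$ref` (no `patternProperties`, no pointer escaping), and `is_valid` and `resolve` are hypothetical helpers. The `fixed` schema assumes, as the later recap slide suggests, that `additionalProperties: false` moves out of `geneticsPatient` and onto `patient`.

```python
import copy

TYPES = {"array": list, "object": dict, "string": str}


def resolve(root, ref):
    """Follow a local reference such as '#/definitions/name' (no escaping)."""
    node = root
    for part in ref.lstrip("#/").split("/"):
        node = node[part]
    return node


def is_valid(schema, instance, root):
    if isinstance(schema, bool):
        return schema
    if "$ref" in schema:
        return is_valid(resolve(root, schema["$ref"]), instance, root)
    if "type" in schema:
        names = schema["type"] if isinstance(schema["type"], list) else [schema["type"]]
        if not any(isinstance(instance, TYPES[n]) for n in names):
            return False
    if isinstance(instance, dict):
        if any(key not in instance for key in schema.get("required", [])):
            return False
        props = schema.get("properties", {})
        for key, value in instance.items():
            if key in props:
                if not is_valid(props[key], value, root):
                    return False
            elif "additionalProperties" in schema:
                # Only `properties` in this same schema object is consulted;
                # keys handled inside allOf or $ref targets do not count.
                if not is_valid(schema["additionalProperties"], value, root):
                    return False
    if "allOf" in schema and not all(is_valid(s, instance, root) for s in schema["allOf"]):
        return False
    if "anyOf" in schema and not any(is_valid(s, instance, root) for s in schema["anyOf"]):
        return False
    return True


broken = {
    "definitions": {
        "phenotypicFeatures": {"type": ["array"]},
        "genomicFeatures": {"type": ["array"]},
        "geneticsPatient": {
            "properties": {
                "phenotypicFeatures": {"$ref": "#/definitions/phenotypicFeatures"},
                "genomicFeatures": {"$ref": "#/definitions/genomicFeatures"},
            },
            "additionalProperties": False,
            "anyOf": [{"required": ["phenotypicFeatures"]}, {"required": ["genomicFeatures"]}],
        },
        "regularPatient": {
            "type": "object",
            "required": ["name"],
            "properties": {"name": {"type": ["string"]}},
        },
    },
    "properties": {"patient": {"allOf": [
        {"$ref": "#/definitions/regularPatient"},
        {"$ref": "#/definitions/geneticsPatient"},
    ]}},
}

# Assumed fix: close the object at the `patient` level and name every allowed key there.
fixed = copy.deepcopy(broken)
del fixed["definitions"]["geneticsPatient"]["additionalProperties"]
fixed["properties"]["patient"].update({
    "additionalProperties": False,
    "properties": {"name": True, "phenotypicFeatures": True, "genomicFeatures": True},
})

instance = {"patient": {"phenotypicFeatures": ["long nose"], "name": "bob"}}
extra = {"patient": {"phenotypicFeatures": [], "name": "bob", "age": 40}}

print(is_valid(broken, instance, broken))  # False: `name` is "additional" inside geneticsPatient
print(is_valid(fixed, instance, fixed))    # True
print(is_valid(fixed, extra, fixed))       # False: `age` is not listed on `patient`
```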

  36. "type": ["string"]
            }
          }
        }
      },
      "properties": {
        "patient": {
          "additionalProperties": false,
          "properties": {
            "name": true,
            "phenotypicFeatures": true,
            "genomicFeatures": true
          },
          "allOf": [
            {
              "$ref": "#/definitions/regularPatient"
            },
            {
              "$ref": "#/definitions/geneticsPatient"
            }
    The only solution is to add `properties`, but we don't want to define any constraints on their value in
    this subschema

  37. We have our schema
    Let's recap how it works

  38. {
      "$schema": "http://json-schema.org/draft-07/schema",
      "title": "MatchMakerExchange format for queries",
      "definitions": {
        "phenotypicFeatures": {
          "type": ["array"]
        },
        "genomicFeatures": {
          "type": ["array"]
        },
        "geneticsPatient": {
          "properties": {
            "phenotypicFeatures": {
              "$ref": "#/definitions/phenotypicFeatures"
            },
            "genomicFeatures": {
              "$ref": "#/definitions/genomicFeatures"
            }
          },
          "anyOf": [
            {
              "required": ["phenotypicFeatures"]
            },
            {
              "required": [
    {
      "patient": {
        "phenotypicFeatures": ["long nose"],
        "name": "bob"
      }
    }
    `phenotypicFeatures` and `genomicFeatures` are defined in `definitions` and are referenced by the individual properties of
    `geneticsPatient`

  39. allOf > $ref schemas and additionalProperties: false
    Being able to combine schema objects without additional properties is a VERY common pain point.
    We recognised this and fixed it in draft-8, but that's a whole other talk.

  40. What else did we discover?
    Schemas may be boolean values, even when used as a value in the properties
    object
    Constructing and merging complex schemas may require some duplication
    additionalProperties CANNOT "See Through" applicator keywords
    like *Of or $ref

  41. Thank you all
    Previous team
    Current team
    Contributors and community
    Implementation developers
    all of you

  42. JSON Schema
    Core Concepts, Common Pitfalls, and Debugging
    Thank you sponsors!
    Ben Hutton
    @relequestual on the internet
    JSON Schema core
    opencollective.com/json-schema
    ☕ ko-fi.com/relequestual
    ASC 2019 - API Specification Conference 2019
