JSON Schema - Core Concepts, Common Pitfalls, and Debugging

Ben Hutton
October 16, 2019

Access the interactive deck at https://stoic-agnesi-d0ac4a.netlify.com

Fund more of these talks: https://opencollective.com/json-schema
Tip this talk: https://ko-fi.com/relequestual

JSON Schema appears pretty simple on the surface, and it can be used to create really simple validation, but sometimes there are more complex structures you want to validate.
Looking after the official JSON Schema Slack server for the past few years has highlighted a number of common problems and pitfalls.

You'll be taken on a journey of discovery to see how reframing your understanding of JSON Schema documents can help avoid common problems and pitfalls, and how to understand core concepts such as applicability.

Transcript

  1. JSON Schema
    Core Concepts, Common Pitfalls, and Debugging
    Ben Hutton
    @relequestual on the internet
    JSON Schema core
    opencollective.com/json-schema
    ☕ ko-fi.com/relequestual
    ASC 2019 - API Specification Conference 2019
    These slides have interactive code which does not display fully in PDF.
    Please see https://stoic-agnesi-d0ac4a.netlify.com for the full version.
    All code used and created for this deck is available on GitHub.

  2. Validation saves lives

  3. Validation (probably?) saves lives

  4. Validation changes lives
    Validation (probably?) saves lives

  5. Matchmaker Exchange API
    Siloed databases of patient data
    (including DNA variants)
    Rare and undiagnosed cases
    (> 10 per country rare)
    Complex legal and ethical situation
    Discover and exchange patient data

  6. Matchmaker Exchange API
    Each database holds slightly different data
    Representation of a patient needs to be uniform
    Write specification using words and examples
    Release v1.0!
    Sounds easy, right?

  7. Building a schema

  8. Requirements for queries
    Root object must have a patient
    patient must have the property
    genomicFeatures or
    phenotypicFeatures or
    both

  9. {
      "$schema": "http://json-schema.org/draft-07/schema",
      "title": "MatchMakerExchange format for queries",
      "patient": {
        "$comment": "`patient`: `genomicFeatures` or `phenotypicFeatures` or both",
        "anyOf": [
          "genomicFeatures",
          "phenotypicFeatures"
        ]
      }
    }
    At the schema root, `patient` is an unknown keyword. Unknown keywords are ignored.
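
To see why this first attempt accepts everything, here is a minimal sketch of a validator that, like a real one, skips any keyword it does not recognise. It is a toy written for this transcript, not a conforming implementation: `is_valid`, `KNOWN_KEYWORDS`, and `TYPES` are hypothetical names, and only `type` and `properties` are handled.

```python
# Toy validator (an assumption for illustration, not a real JSON Schema library).
TYPES = {"array": list, "object": dict, "string": str}
KNOWN_KEYWORDS = {"type", "properties"}  # deliberately tiny subset


def is_valid(schema: dict, instance) -> bool:
    for keyword, value in schema.items():
        if keyword not in KNOWN_KEYWORDS:
            continue  # unknown keyword: silently ignored
        if keyword == "type":
            names = value if isinstance(value, list) else [value]
            if not any(isinstance(instance, TYPES[name]) for name in names):
                return False
        elif keyword == "properties" and isinstance(instance, dict):
            for name, subschema in value.items():
                if name in instance and not is_valid(subschema, instance[name]):
                    return False
    return True


# The first attempt from the slide: `patient` sits at the root,
# where it is not a keyword, so its contents are never looked at.
first_attempt = {
    "$schema": "http://json-schema.org/draft-07/schema",
    "title": "MatchMakerExchange format for queries",
    "patient": {"anyOf": ["genomicFeatures", "phenotypicFeatures"]},
}

print(is_valid(first_attempt, {"patient": {}}))     # True
print(is_valid(first_attempt, "not even an object"))  # True
```

Every key in `first_attempt` is skipped, so no instance can ever fail.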

  10. What's a "schema" anyway?

  11. JSON Schema
    A vocabulary that allows you to annotate and validate JSON documents.
    An IETF "personal draft" document specification
    JSON!
    Must be a boolean or an object

  12. JSON Schema
    This talk covers draft-7. Not draft-4, not draft-8.
    draft-8 ("draft 2019-09") is out, but implementations need to catch up
    draft-7 is the latest well-supported version

  13. Key Concepts
    Schema “keywords”: Object properties that are applied to the instance
    Instance: The JSON document being validated or associated with a Schema
    Root Schema: A Schema that is the whole JSON document
    Subschema: A Schema as a value of an object or array
    Keywords fall under several behavior categories which include:
    Assertions: produce a boolean result when applied to an instance
    Annotations: attach information to an instance for application use

  14. "properties" keyword
    properties is an object
    Each value in the object is a Schema, applied to the value of the matching key in the instance

  15. {
      "$schema": "http://json-schema.org/draft-07/schema",
      "title": "MatchMakerExchange format for queries",
      "properties": {
        "patient": {
          "$comment": "`patient`: `genomicFeatures` or `phenotypicFeatures` or both",
          "anyOf": [
            {
              "properties": {
                "phenotypicFeatures": true
              }
            },
            {
              "properties": {
                "genomicFeatures": true
              }
            }
          ]
        }
      }
    }
    Remember, the values of a `properties` object must be Schemas. A Boolean is a valid Schema.

  16. A Boolean can be a schema?

  17. What's a Schema again? (pt 2)

  18. An Object or a Boolean
    Always passes validation
    {}
    An empty object
    true
    a `true` boolean value
    Always fails validation
    { "not": {} }
    `not` keyword: inverts the assertion
    false
    a `false` boolean value
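
The four trivial schemas above can be checked with a few lines of code. This is a minimal sketch, assuming a hypothetical `is_valid` helper that understands only boolean schemas and the `not` keyword; any other keyword is ignored, so it is not a general validator.

```python
# Toy validator (illustrative assumption): boolean schemas and `not` only.
def is_valid(schema, instance) -> bool:
    if isinstance(schema, bool):
        return schema  # true accepts everything, false rejects everything
    if "not" in schema:
        return not is_valid(schema["not"], instance)  # invert the subschema's result
    return True  # no recognised keywords, so nothing can fail


samples = [None, 42, "text", [], {}]

always_pass = [True, {}]
always_fail = [False, {"not": {}}]

for schema in always_pass:
    print(schema, all(is_valid(schema, sample) for sample in samples))
for schema in always_fail:
    print(schema, any(is_valid(schema, sample) for sample in samples))
```

The first loop reports that every sample passes; the second reports that no sample passes, whatever its type.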

  19. Does it work?
    {
      "$schema": "http://json-schema.org/draft-07/schema",
      "title": "MatchMakerExchange format for queries",
      "properties": {
        "patient": {
          "anyOf": [
            {
              "properties": {
                "phenotypicFeatures": {
                  "type": ["array"]
                }
              }
            },
            {
              "properties": {
                "genomicFeatures": {
                  "type": ["array"]
                }
              }
            }
          ]
        }
      }
    }
    {
      "patient": {
        "phenotypicFeatures": "long nose"
      }
    }
    But didn't I define that it should be an array?

  20. (More) Key Concepts
    Applicator keywords: Determine how subschemas are applied to an instance location.
    These include, but are not limited to:
    oneOf, allOf, and anyOf

  21. *Of applicators
    The value of each *Of keyword must be an array of Schemas.
    The result of applying a schema to an instance includes an assertion of validity.
    The resulting assertions are modified or combined to produce a final result.
    For example, oneOf requires that exactly one of the Schemas in the array is valid.
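
How the *Of keywords combine branch results can be sketched in code, using the two `anyOf` branches and the "long nose" instance from the earlier slide. This is a toy under stated assumptions: `is_valid`, `TYPES`, and `COMBINERS` are hypothetical names, and only `type`, `properties`, and the three *Of keywords are handled.

```python
# Toy validator (illustrative assumption, not a conforming implementation).
TYPES = {"array": list, "object": dict, "string": str}

# Each *Of keyword reduces a list of per-branch booleans to one boolean.
COMBINERS = {
    "allOf": all,
    "anyOf": any,
    "oneOf": lambda results: sum(results) == 1,  # exactly one branch valid
}


def is_valid(schema, instance) -> bool:
    if isinstance(schema, bool):
        return schema
    if "type" in schema:
        names = schema["type"] if isinstance(schema["type"], list) else [schema["type"]]
        if not any(isinstance(instance, TYPES[name]) for name in names):
            return False
    if isinstance(instance, dict):
        for name, subschema in schema.get("properties", {}).items():
            # A property subschema only applies when the key is present.
            if name in instance and not is_valid(subschema, instance[name]):
                return False
    for keyword, combine in COMBINERS.items():
        if keyword in schema:
            if not combine([is_valid(branch, instance) for branch in schema[keyword]]):
                return False
    return True


branches = [
    {"properties": {"phenotypicFeatures": {"type": ["array"]}}},
    {"properties": {"genomicFeatures": {"type": ["array"]}}},
]
patient = {"phenotypicFeatures": "long nose"}

print([is_valid(branch, patient) for branch in branches])  # [False, True]
print(is_valid({"anyOf": branches}, patient))  # True
print(is_valid({"allOf": branches}, patient))  # False
```

The second branch passes because `genomicFeatures` is absent, so its `properties` subschema has nothing to apply to. One passing branch is enough for `anyOf`, which is why the string value was accepted.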

  22. Why doesn't it work?
    {
      "$schema": "http://json-schema.org/draft-07/schema",
      "title": "MatchMakerExchange format for queries",
      "properties": {
        "patient": {
          "anyOf": [
            {
              "properties": {
                "phenotypicFeatures": {
                  "type": ["array"]
                }
              },
              "additionalProperties": false
            },
            {
              "properties": {
                "genomicFeatures": {
                  "type": ["array"]
                }
              },
              "additionalProperties": false
            }
          ]
        }
      }
    }
    {
      "patient": {
        "phenotypicFeatures": "long nose"
      }
    }
    ...
    ❌ YES! Validation fails, as expected

  23. We fixed it!
    Almost

  24. {
      "$schema": "http://json-schema.org/draft-07/schema",
      "title": "MatchMakerExchange format for queries",
      "properties": {
        "patient": {
          "anyOf": [
            {
              "required": ["phenotypicFeatures"],
              "properties": {
                "phenotypicFeatures": {
                  "type": ["array"]
                }
              },
              "additionalProperties": false
            },
            {
              "required": ["genomicFeatures"],
              "properties": {
                "genomicFeatures": {
                  "type": ["array"]
                }
              },
              "additionalProperties": false
            }
          ]
        }
      }
    }
    {
      "patient": {
      }
    }
    ❌ Validation now fails as expected!

  25. What did we discover?
    Some keywords have values that are a JSON Schema (a subschema)
    *Of keywords are applicators that take an array of subschemas
    You can set the value of a subschema to a boolean false to check the validation path is as you expect
    The values of a properties object are subschemas which are applied to an object's values for matching keys
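
The debugging tip above (swap a subschema for `false` to see whether it is reached) can be illustrated with a small sketch. The `is_valid` helper is a hypothetical toy that understands only boolean schemas and `properties`, and the misspelled key in `not_reached` is an invented example.

```python
# Toy validator (illustrative assumption): boolean schemas and `properties` only.
def is_valid(schema, instance) -> bool:
    if isinstance(schema, bool):
        return schema
    if isinstance(instance, dict):
        for name, subschema in schema.get("properties", {}).items():
            if name in instance and not is_valid(subschema, instance[name]):
                return False
    return True


instance = {"patient": {"phenotypicFeatures": ["long nose"]}}

# `false` placed where we believe the instance value is validated.
reached = {
    "properties": {"patient": {"properties": {"phenotypicFeatures": False}}}
}
# Same probe, but under a misspelled key (hypothetical typo: singular "Feature").
not_reached = {
    "properties": {"patient": {"properties": {"phenotypicFeature": False}}}
}

print(is_valid(reached, instance))      # False: the subschema is applied
print(is_valid(not_reached, instance))  # True: the subschema is never applied
```

If validation still passes after planting `false`, the subschema at that location is not being applied to the value you expected.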

  26. Spec change!
    That doesn't happen, does it?

  27. Spec change
    Tidy the subschemas into definitions
    Refactor the schema
    patient now needs to include some additional fields. A definition is provided.

  28. Let's do this

  29. {
      "$schema": "http://json-schema.org/draft-07/schema",
      "title": "MatchMakerExchange format for queries",
      "definitions": {
        "phenotypicFeatures": {
          "type": ["array"]
        },
        "genomicFeatures": {
          "type": ["array"]
        },
        "geneticsPatient": {
          "properties": {
            "phenotypicFeatures": {
              "$ref": "#/definitions/phenotypicFeatures"
            },
            "genomicFeatures": {
              "$ref": "#/definitions/genomicFeatures"
            }
          },
          "additionalProperties": false,
          "anyOf": [
            {
              "required": ["phenotypicFeatures"]
            },
            {
              "required": ["genomicFeatures"]
            }
          ]
        },
        "regularPatient": {
          "type": "object",
          "required": ["name"],
          "properties": {
            "name": {
              "type": ["string"]
            }
          }
        }
      },
      "properties": {
        "patient": {
          "allOf": [
            {
              "$ref": "#/definitions/regularPatient"
            },
            {
              "$ref": "#/definitions/geneticsPatient"
            }
          ]
    Validation always fails... how do we fix this?

  30. Validation Error:
    should NOT have additional properties.
    additionalProperties at "#/additionalProperties"
    Instance location: "/patient"
    {
      "patient": {
        "phenotypicFeatures": ["long nose"],
        "name": "bob"
      }
    }
    Does it work?
    ❌ Validation always fails... how do we fix this?

  31. "required": ["genomicFeatures"]
            }
          ]
        },
        "regularPatient": {
          "type": "object",
          "required": ["name"],
          "properties": {
            "name": {
              "type": ["string"]
            }
          }
        }
      },
      "properties": {
        "patient": {
    ... and from `geneticsPatient`, but no others.

  32. additionalProperties
    What specifically does additionalProperties DO?
    additionalProperties takes a schema as its value. Often this is boolean false.
    The schema is only applied to the instance object's values where the keys have not already been evaluated as a result of properties or patternProperties within the same schema object.
    additionalProperties cannot "see through" applicator keywords (such as allOf and $ref references)
    Validation with "additionalProperties" applies only to the child values of instance names that do not match any names in "properties", and do not match any regular expression in "patternProperties".
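
The "cannot see through" behaviour can be reproduced with a short sketch. It assumes a hypothetical `is_valid` toy that handles boolean schemas, `properties`, `additionalProperties`, and `allOf`; `patternProperties` and `$ref` are left out for brevity, so the two definitions are inlined rather than referenced.

```python
# Toy validator (illustrative assumption, not a conforming implementation).
def is_valid(schema, instance) -> bool:
    if isinstance(schema, bool):
        return schema
    if isinstance(instance, dict):
        # Only `properties` in THIS schema object is consulted; names declared
        # by sibling allOf branches are invisible here.
        declared = schema.get("properties", {})
        for name, value in instance.items():
            if name in declared:
                if not is_valid(declared[name], value):
                    return False
            elif not is_valid(schema.get("additionalProperties", True), value):
                return False
    return all(is_valid(branch, instance) for branch in schema.get("allOf", []))


genetics_patient = {
    "properties": {"phenotypicFeatures": True, "genomicFeatures": True},
    "additionalProperties": False,
}
regular_patient = {"properties": {"name": True}}
combined = {"allOf": [regular_patient, genetics_patient]}

patient = {"phenotypicFeatures": ["long nose"], "name": "bob"}

print(is_valid(regular_patient, patient))   # True
print(is_valid(genetics_patient, patient))  # False: `name` counts as additional
print(is_valid(combined, patient))          # False
```

Inside `genetics_patient`, the key `name` is not listed in that object's own `properties`, so `additionalProperties: false` rejects it even though the sibling branch declares it.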

  33. "type": ["string"]
            }
          }
        }
      },
      "properties": {
        "patient": {
          "additionalProperties": false,
          "properties": {
            "name": true,
            "phenotypicFeatures": true,
            "genomicFeatures": true
          },
          "allOf": [
            {
              "$ref": "#/definitions/regularPatient"
            },
            {
              "$ref": "#/definitions/geneticsPatient"
            }
    The only solution is to add `properties`, but we don't want to define any constraints on their values in this subschema
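
The fix can be exercised end to end with a slightly larger sketch. The schema below is assembled from the fragments shown on the slides (definitions without `additionalProperties`, plus the `patient` subschema above), so its exact shape is an assumption. `is_valid` and `resolve` are hypothetical helpers covering only the keywords this schema uses, with no JSON Pointer escaping and only three type names.

```python
# Toy validator (illustrative assumption, not a conforming implementation).
TYPES = {"array": list, "object": dict, "string": str}


def resolve(root, ref):
    """Follow a local reference such as '#/definitions/name' from the root schema."""
    node = root
    for part in ref.lstrip("#/").split("/"):
        node = node[part]
    return node


def is_valid(schema, instance, root=None) -> bool:
    root = schema if root is None else root
    if isinstance(schema, bool):
        return schema
    if "$ref" in schema:  # in draft-07, keywords next to $ref are ignored
        return is_valid(resolve(root, schema["$ref"]), instance, root)
    if "type" in schema:
        names = schema["type"] if isinstance(schema["type"], list) else [schema["type"]]
        if not any(isinstance(instance, TYPES[name]) for name in names):
            return False
    if isinstance(instance, dict):
        if any(name not in instance for name in schema.get("required", [])):
            return False
        declared = schema.get("properties", {})
        for name, value in instance.items():
            if name in declared:
                if not is_valid(declared[name], value, root):
                    return False
            elif not is_valid(schema.get("additionalProperties", True), value, root):
                return False
    if not all(is_valid(s, instance, root) for s in schema.get("allOf", [])):
        return False
    if "anyOf" in schema and not any(is_valid(s, instance, root) for s in schema["anyOf"]):
        return False
    return True


schema = {
    "definitions": {
        "phenotypicFeatures": {"type": ["array"]},
        "genomicFeatures": {"type": ["array"]},
        "geneticsPatient": {
            "properties": {
                "phenotypicFeatures": {"$ref": "#/definitions/phenotypicFeatures"},
                "genomicFeatures": {"$ref": "#/definitions/genomicFeatures"},
            },
            "anyOf": [
                {"required": ["phenotypicFeatures"]},
                {"required": ["genomicFeatures"]},
            ],
        },
        "regularPatient": {
            "type": "object",
            "required": ["name"],
            "properties": {"name": {"type": ["string"]}},
        },
    },
    "properties": {
        "patient": {
            "additionalProperties": False,
            # Names only; the constraints live in the referenced definitions.
            "properties": {"name": True, "phenotypicFeatures": True, "genomicFeatures": True},
            "allOf": [
                {"$ref": "#/definitions/regularPatient"},
                {"$ref": "#/definitions/geneticsPatient"},
            ],
        }
    },
}

good = {"patient": {"phenotypicFeatures": ["long nose"], "name": "bob"}}
extra = {"patient": {"phenotypicFeatures": ["long nose"], "name": "bob", "age": 40}}

print(is_valid(schema, good))   # True
print(is_valid(schema, extra))  # False: `age` is not listed next to additionalProperties
```

Listing the property names as `true` beside `additionalProperties` is the duplication the talk refers to: the names appear twice, while the value constraints stay in the definitions.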

  34. We have our schema
    Let's recap how it works

  35. {
      "$schema": "http://json-schema.org/draft-07/schema",
      "title": "MatchMakerExchange format for queries",
      "definitions": {
        "phenotypicFeatures": {
          "type": ["array"]
        },
        "genomicFeatures": {
          "type": ["array"]
        },
        "geneticsPatient": {
          "properties": {
            "phenotypicFeatures": {
              "$ref": "#/definitions/phenotypicFeatures"
            },
            "genomicFeatures": {
              "$ref": "#/definitions/genomicFeatures"
            }
          },
          "anyOf": [
            {
              "required": ["phenotypicFeatures"]
            },
            {
              "required": [
    {
      "patient": {
        "phenotypicFeatures": ["long nose"],
        "name": "bob"
      }
    }
    `phenotypicFeatures` and `genomicFeatures` are defined in `definitions` and are referenced by the individual properties of `geneticsPatient`

  36. allOf > $ref schemas and additionalProperties: false
    Being able to combine schema objects without additional properties is a VERY common pain point.
    We recognised this and fixed it in draft-8, but that's a whole other talk.

  37. What else did we discover?
    Schemas may be boolean values, even as a value in the properties object
    Constructing and merging complex schemas may require some duplication
    additionalProperties CANNOT "see through" applicator keywords like *Of or $ref

  38. Thank you all
    Previous team
    Current team
    Contributors and community
    Implementation developers
    all of you

  39. JSON Schema
    Core Concepts, Common Pitfalls, and Debugging
    Thank you sponsors!
    Ben Hutton
    @relequestual on the internet
    JSON Schema core
    opencollective.com/json-schema
    ☕ ko-fi.com/relequestual
    ASC 2019 - API Specification Conference 2019
