
Why your configuration needs a schema


A talk from Configuration Management Camp 2018 about the high cost of current configuration management approaches, why that cost leads to people editing serialisation formats like YAML and JSON directly, and how schemas and auto-generation can help.

Gareth Rushgrove

February 06, 2018


Transcript

  1. Why your configuration
    needs a schema
    Gareth Rushgrove


  2. @garethr



  4. - The proliferation of config file formats
    - The high cost of config management
    - Why configuration needs a schema
    - Auto-generate everything


  5. The proliferation of
    configuration file formats
    The state of things


  6. XML, INI, JSON, YAML, EDN,
    HOCON, TOML, CSON, Java
    Properties, internal DSLs, ...


  7. Everyone has opinions about
    config file formats


  8. Don’t use JSON as a Config File Format




  11. Some formats are associated with
    certain languages or frameworks


  12. Others become the default for
    communities of practice


  13. Apart from the parsing bit, to the
    application being configured it’s
    just a data structure
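
A minimal sketch of the point on this slide, using only the Python standard library. The `server` section and its `host` and `port` keys are invented for illustration and do not come from the talk. Once parsed, the JSON and INI versions of the same settings are the same dictionary.

```python
import configparser
import json

# Hypothetical settings, written once as JSON and once as INI.
JSON_TEXT = '{"server": {"host": "example.test", "port": "8080"}}'

INI_TEXT = """
[server]
host = example.test
port = 8080
"""


def load_json(text):
    """Parse JSON text into a plain dict."""
    return json.loads(text)


def load_ini(text):
    """Parse INI text into a plain dict keyed by section name."""
    parser = configparser.ConfigParser()
    parser.read_string(text)
    # configparser returns every value as a string.
    return {section: dict(parser[section]) for section in parser.sections()}


from_json = load_json(JSON_TEXT)
from_ini = load_ini(INI_TEXT)
print(from_json == from_ini)
```

The port is a string in the JSON on purpose: INI has no native types, which is one of the ways the format leaks into the data.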


  14. For the operator, all of the different
    formats are separate user interfaces


  15. This is one of the reasons for
    higher-level configuration
    management tools


  16. The cost of configuration
    management
    A barrier to entry to good tooling


  17. Having multiple configuration management
    tools is not a bad thing, but it does
    mean lots of reinventing the wheel


  18. Everyone ends up with a way to
    manage packages, services,
    files, users and groups


  19. Worse, everyone ends up with a
    way to manage Apache


  20. sous-chef/apache2
    1498 commits, 116 contributors, 46 releases
    puppetlabs/puppetlabs-apache
    2992 commits, 342 contributors, 39 releases
    Ansible Galaxy
    298 results for apache


  21. Managing files is a big part of
    managing most systems


  22. Using BigQuery on 7.5 million lines of Puppet


  23. What types are used the most?


  24. More than 30% of Puppet
    resources were files


  25. Option 1: Templates
    None of the benefits of your chosen
    tool, and you’re exposed to all the
    configuration file formats directly


  26. template '/etc/app/config.yaml' do
    source 'config.yaml.erb'
    mode '0755'
    owner 'web'
    group 'web'
    end
    You need a separate templating language


  27. How can you reason about a system
    when the configuration is spread
    across an explosion of templating
    languages, file formats and templates?


  28. Option 2: Format-specific resources
    You can now use your chosen tool,
    but the tool has no context for the
    application, it’s just data, and the
    format still bleeds through


  29. ini_setting { "sample setting":
    ensure => present,
    path => '/tmp/foo.ini',
    section => 'bar',
    setting => 'baz',
    value => 'quux',
    }
    Manage an INI file with Puppet
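
To make concrete what a format-specific resource like this does, here is a rough Python sketch of the same idea. It is not how the Puppet type is implemented; `ensure_ini_setting` is a hypothetical helper written for this example. It sets one key idempotently and knows nothing about what the setting means to the application.

```python
import configparser
import os
import tempfile


def ensure_ini_setting(path, section, setting, value):
    """Set one key in an INI file; return True only if the file changed."""
    parser = configparser.ConfigParser()
    parser.read(path)  # a missing file is silently treated as empty
    current = None
    if parser.has_section(section):
        current = parser.get(section, setting, fallback=None)
    if current == value:
        return False  # already in the desired state
    if not parser.has_section(section):
        parser.add_section(section)
    parser.set(section, setting, value)
    with open(path, "w") as handle:
        parser.write(handle)
    return True


# Same values as the slide, written to a throwaway directory.
path = os.path.join(tempfile.mkdtemp(), "foo.ini")
print(ensure_ini_setting(path, "bar", "baz", "quux"))  # first run writes the file
print(ensure_ini_setting(path, "bar", "baz", "quux"))  # second run changes nothing
```

The helper would accept any section, key and value without complaint, which is the "it's just data" problem in miniature.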


  30. A PowerShell DSC Resource for INI files


  31. Import-DscResource -ModuleName DSCR_IniFile
    cIniFile Apple {
    Path = "C:\Test.ini"
    Section = ""
    Key = "Fruit_A"
    Value = "Apple"
    }
    Manage an INI file with DSC


  32. Ansible module for INI files


  33. Chef resources for JSON and YAML files


  34. Option 3: App-specific resources
    You get all the power of your
    chosen tool, but at the cost of
    bespoke development


  35. webapp "cfgmgmtcamp" do
    static_url_path "/my_project/static"
    mysql_database_user "project_user_name"
    show_settings_route "/show-settings"
    debug true
    end
    A bespoke application in Chef


  36. How can we lower the cost of native
    resources for configuration?


  37. What have schemas got to
    do with this?
    Moving on to talk about solutions


  38. Most Chef cookbooks or Ansible or
    Puppet modules are not written by
    the developers of the application
    being managed


  39. Most configuration is informally
    specified via implementation, and
    often not versioned like an API


  40. What if instead applications provided
    a schema for their configuration?


  41. Examples and
    demonstrations
    Experiments in auto generating tools


  42. Kubernetes has a well-defined set of
    configuration primitives: Pods,
    Deployments, Services,
    ReplicationControllers, etc.


  43. Kubernetes uses OpenAPI to describe the API


  44. OpenAPI uses JSON Schema internally


  45. Kubernetes JSON Schema
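
One way to get from an OpenAPI description to per-type JSON Schemas is to lift each entry under `definitions` into its own document. The sketch below shows only that step, on a tiny invented stand-in for `swagger.json`. The output file names are an assumption, and a real conversion would also have to resolve `$ref` pointers between definitions, which this does not attempt.

```python
import json

# A tiny stand-in for Kubernetes' swagger.json; the real file is far larger.
OPENAPI = {
    "swagger": "2.0",
    "definitions": {
        "Pod": {
            "type": "object",
            "properties": {"kind": {"type": "string"}},
        },
        "Service": {
            "type": "object",
            "properties": {"kind": {"type": "string"}},
        },
    },
}


def split_definitions(openapi):
    """Return {file name: standalone schema} for each OpenAPI definition."""
    schemas = {}
    for name, definition in openapi.get("definitions", {}).items():
        # OpenAPI 2.0 schema objects are based on JSON Schema draft 4.
        schema = {"$schema": "http://json-schema.org/draft-04/schema#"}
        schema.update(definition)
        schemas[name.lower() + ".json"] = schema
    return schemas


for filename, schema in sorted(split_definitions(OPENAPI).items()):
    print(filename, json.dumps(schema, sort_keys=True))
```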


  46. That’s a lot of JSON
    PS> (Get-Content -Path swagger.json | Measure-Object -line).Lines
    85340
    PS> (Get-ChildItem -Path v*/*.json -Recurse | Measure-Object).Count
    26181
    PS> (Get-ChildItem -Path v*/*.json -Recurse | Get-Content | Measure-Object -line).Lines
    7296392


  47. Generated Puppet types and providers


  48. Generated jsonnet templates


  49. Generated ICL templates


  50. Validation tools


  51. Programming language clients


  52. Could we have this for any
    application?


  53. A simple example application from the internet


  54. {
    "STATIC_URL_PATH": "/my_project/static",
    "MYSQL_DATABASE_USER": "project_user_name",
    "SHOW_SETTINGS_ROUTE": "/show-settings",
    "DEBUG": true
    }
    Our application has a configuration file


  55. {
    "definitions": {},
    "$schema": "http://json-schema.org/draft-06/schema#",
    "$id": "app_config",
    "title": "app_config",
    "type": "object",
    "additionalProperties": false,
    "required": [
    "STATIC_URL_PATH",
    "MYSQL_DATABASE_USER"
    ],
    "properties": {
    "STATIC_URL_PATH": {
    "$id": "/properties/STATIC_URL_PATH",
    "type": "string",
    "title": "Static URL path",
    "description": "A filesystem path for static assets",
    Let’s write a (JSON) schema


  56. "additionalProperties": false,
    Only allow the defined properties


  57. "required": [
    "STATIC_URL_PATH",
    "MYSQL_DATABASE_USER"
    ],
    These properties are required
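
What these two keywords check can be sketched in a few lines of plain Python. This is an illustration of the semantics only, not a JSON Schema validator: `check_keys` is a hypothetical helper, it ignores every other keyword (including `patternProperties`), and the types given for `SHOW_SETTINGS_ROUTE` and `DEBUG` are inferred from the example configuration file rather than taken from the schema shown in the talk.

```python
def check_keys(config, schema):
    """Return error messages for `required` and `additionalProperties` only."""
    errors = []
    for name in schema.get("required", []):
        if name not in config:
            errors.append(f"'{name}' is a required property")
    if schema.get("additionalProperties", True) is False:
        allowed = schema.get("properties", {})
        for name in config:
            if name not in allowed:
                errors.append(f"'{name}' is not an allowed property")
    return errors


SCHEMA = {
    "additionalProperties": False,
    "required": ["STATIC_URL_PATH", "MYSQL_DATABASE_USER"],
    "properties": {
        "STATIC_URL_PATH": {"type": "string"},
        "MYSQL_DATABASE_USER": {"type": "string"},
        "SHOW_SETTINGS_ROUTE": {"type": "string"},
        "DEBUG": {"type": "boolean"},
    },
}

# A config with one required key missing and one misspelled key.
for error in check_keys({"MYSQL_DATABASE_USER": "db", "DEBGU": True}, SCHEMA):
    print(error)
```

The misspelled `DEBGU` is the kind of mistake that `"additionalProperties": false` catches and that an application would otherwise silently ignore.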


  58. "STATIC_URL_PATH": {
    "$id": "/properties/STATIC_URL_PATH",
    "type": "string",
    "title": "Static URL path",
    "description": "A filesystem path for static assets",
    "examples": [
    "/my_project/static"
    ]
    },
    Describe each individual property


  59. Validate config using the schemas
    $ jsonschema -F "{error.message}" -i app.json schema.json
    u'STATIC_URL_PATH' is a required property


  60. We have a schema. Now what?


  61. Validate arbitrary structures with JSON Schema
    import json
    import fastjsonschema
    data = {
    "STATIC_URL_PATH": "/my_project/static",
    "MYSQL_DATABASE_USER": "project_user_name",
    "SHOW_SETTINGS_ROUTE": "/show-settings",
    "DEBUG": True,
    }
    validate = fastjsonschema.compile(json.load(open('schema.json')))
    validate(data)
    print(json.dumps(data))


  62. The JSON in JSON Schema refers to
    the syntax for the schema. It can be
    used to validate data in other formats
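
As a rough illustration, the sketch below reads the same settings from an INI file and checks them against schema rules expressed as a Python dict. The `[app]` section name is invented, and `type_errors` is a hypothetical stand-in that only understands the `type` keyword for strings and booleans; a real validator such as the ones shown on the neighbouring slides would be used in practice.

```python
import configparser

# JSON Schema type names mapped to Python types (a deliberately partial table).
PYTHON_TYPES = {"string": str, "boolean": bool}

SCHEMA = {
    "properties": {
        "STATIC_URL_PATH": {"type": "string"},
        "DEBUG": {"type": "boolean"},
    },
}

INI_TEXT = """
[app]
STATIC_URL_PATH = /my_project/static
DEBUG = true
"""


def load_ini(text):
    """Parse the hypothetical [app] section into typed Python values."""
    parser = configparser.ConfigParser()
    parser.optionxform = str  # keep key case; configparser lowercases by default
    parser.read_string(text)
    section = parser["app"]
    return {
        "STATIC_URL_PATH": section.get("STATIC_URL_PATH"),
        "DEBUG": section.getboolean("DEBUG"),
    }


def type_errors(data, schema):
    """Check only the `type` keyword of each declared property."""
    errors = []
    for name, rules in schema["properties"].items():
        expected = PYTHON_TYPES[rules["type"]]
        if name in data and not isinstance(data[name], expected):
            errors.append(f"{name} should be of type {rules['type']}")
    return errors


config = load_ini(INI_TEXT)
print(config, type_errors(config, SCHEMA))
```

The schema never mentions INI. Only `load_ini` knows about the file format, so swapping in a YAML or TOML loader would leave the validation step untouched.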


  63. Generate browser-based user interfaces


  64. Generate interactive documentation


  65. Generate models in different languages


  66. Quicktype generating Simple Types
    $ docker run -v ${PWD}:/pwd quicktype -l types -s schema /pwd/schemas/schema.json
    class Schema {
    staticURLPath: String
    mysqlDatabaseUser: String
    showSettingsRoute: Maybe
    debug: Maybe
    }


  67. Quicktype generating Go
    $ docker run -v ${PWD}:/pwd quicktype -l go -s schema /pwd/schemas/schema.json
    // To parse and unparse this JSON data, add this code to your project and do:
    //
    // r, err := UnmarshalSchema(bytes)
    // bytes, err = r.Marshal()
    package main
    import "encoding/json"
    func UnmarshalSchema(data []byte) (Schema, error) {
    var r Schema
    err := json.Unmarshal(data, &r)
    return r, err
    }
    func (r *Schema) Marshal() ([]byte, error) {


  68. Quicktype currently supports
    generating TypeScript, Elm, Java, C#,
    Go, Swift and C++
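
The same idea can be sketched for a language that is not on this list. The snippet below is not Quicktype and makes no claim about how Quicktype works; `to_dataclass_source` is a hypothetical function that emits Python dataclass source from the flat schema used in this talk, handling only string and boolean properties. The property types are inferred from the example configuration file.

```python
SCHEMA = {
    "title": "app_config",
    "required": ["STATIC_URL_PATH", "MYSQL_DATABASE_USER"],
    "properties": {
        "STATIC_URL_PATH": {"type": "string"},
        "MYSQL_DATABASE_USER": {"type": "string"},
        "SHOW_SETTINGS_ROUTE": {"type": "string"},
        "DEBUG": {"type": "boolean"},
    },
}

# JSON Schema type names mapped to Python annotations (partial on purpose).
ANNOTATIONS = {"string": "str", "boolean": "bool"}


def to_dataclass_source(schema):
    """Emit Python source for a dataclass that mirrors the schema."""
    class_name = "".join(part.capitalize() for part in schema["title"].split("_"))
    required = set(schema.get("required", []))
    lines = [
        "from dataclasses import dataclass",
        "from typing import Optional",
        "",
        "",
        "@dataclass",
        f"class {class_name}:",
    ]
    # Required fields go first: a dataclass field without a default
    # cannot follow one that has a default.
    ordered = sorted(schema["properties"], key=lambda name: name not in required)
    for name in ordered:
        annotation = ANNOTATIONS[schema["properties"][name]["type"]]
        if name in required:
            lines.append(f"    {name.lower()}: {annotation}")
        else:
            lines.append(f"    {name.lower()}: Optional[{annotation}] = None")
    return "\n".join(lines)


print(to_dataclass_source(SCHEMA))
```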


  69. Python JSON Schema Objects


  70. Dynamically build objects from schemas
    import python_jsonschema_objects as pjs
    import json
    schema = json.load(open('schema.json'))
    builder = pjs.ObjectBuilder(schema)
    ns = builder.build_classes()
    Config = ns.AppConfig
    config = Config(
    STATIC_URL_PATH="/static",
    MYSQL_DATABASE_USER="db",
    )
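
For comparison, a much cruder version of runtime class building is possible with the standard library alone. This sketch uses `dataclasses.make_dataclass` and is not how python_jsonschema_objects works: it enforces that required properties are supplied but does no type or value validation. `build_class` and the property types are assumptions made for the example.

```python
import dataclasses

SCHEMA = {
    "required": ["STATIC_URL_PATH", "MYSQL_DATABASE_USER"],
    "properties": {
        "STATIC_URL_PATH": {"type": "string"},
        "MYSQL_DATABASE_USER": {"type": "string"},
        "SHOW_SETTINGS_ROUTE": {"type": "string"},
        "DEBUG": {"type": "boolean"},
    },
}

# JSON Schema type names mapped to Python types (partial on purpose).
PYTHON_TYPES = {"string": str, "boolean": bool}


def build_class(schema, class_name="AppConfig"):
    """Build a dataclass at runtime with one field per schema property."""
    required = set(schema.get("required", []))
    fields = []
    # Required fields first, because fields with defaults must come last.
    for name in sorted(schema["properties"], key=lambda n: n not in required):
        py_type = PYTHON_TYPES[schema["properties"][name]["type"]]
        if name in required:
            fields.append((name, py_type))
        else:
            fields.append((name, py_type, dataclasses.field(default=None)))
    return dataclasses.make_dataclass(class_name, fields)


Config = build_class(SCHEMA)
config = Config(STATIC_URL_PATH="/static", MYSQL_DATABASE_USER="db")
print(config)
```

Leaving out a required property raises a `TypeError` when the object is constructed, which is the minimum a schema-derived class should offer.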


  71. What if we could generate Puppet
    types, Chef resources, Libral
    providers, Ansible modules, etc.
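
For Ansible, the natural target is a module's `argument_spec`, which is itself a Python dict. The sketch below is one hypothetical way to derive it from the schema and is not code from the talk. The `default` on `SHOW_SETTINGS_ROUTE` is an assumption, inferred from the generated Chef output shown later, and only flat schemas with simple types are handled.

```python
from pprint import pprint

SCHEMA = {
    "required": ["STATIC_URL_PATH", "MYSQL_DATABASE_USER"],
    "properties": {
        "STATIC_URL_PATH": {"type": "string"},
        "MYSQL_DATABASE_USER": {"type": "string"},
        "SHOW_SETTINGS_ROUTE": {"type": "string", "default": "/settings"},
        "DEBUG": {"type": "boolean"},
    },
}

# JSON Schema type names mapped to Ansible argument_spec type names.
ANSIBLE_TYPES = {
    "string": "str",
    "boolean": "bool",
    "integer": "int",
    "number": "float",
    "array": "list",
    "object": "dict",
}


def to_argument_spec(schema):
    """Translate a flat JSON Schema into an Ansible-style argument_spec dict."""
    required = set(schema.get("required", []))
    spec = {}
    for name, rules in schema["properties"].items():
        option = {"type": ANSIBLE_TYPES[rules["type"]]}
        if name in required:
            option["required"] = True
        if "default" in rules:
            option["default"] = rules["default"]
        spec[name.lower()] = option
    return spec


pprint(to_argument_spec(SCHEMA))
```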


  72. Live Demo Klaxon


  73. Generate Chef resource from schema
    $ ./to_chef.py
    resource_name :app_config
    property :path, String, name_property: true
    property :static_url_path, String, required: true
    property :mysql_database_user, String, required: true
    property :show_settings_route, String, default: '/settings'
    property :debug, Boolean
    action :create do
    file path do
    content "{
    STATIC_URL_PATH: "#{static_url_path}",
    MYSQL_DATABASE_USER: "#{mysql_database_user}",
    SHOW_SETTINGS_ROUTE: "#{show_settings_route}",
    DEBUG: #{debug}
    }"
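
The source of `to_chef.py` is not shown in the deck. As a guess at the general shape of such a generator, the sketch below produces the `property` lines of the resource from the schema. `to_chef_properties` is hypothetical, the `default` value in the schema is inferred from the output above, and booleans are emitted as `[true, false]`, the form Chef's documentation uses for boolean properties, where the slide shows `Boolean`.

```python
SCHEMA = {
    "title": "app_config",
    "required": ["STATIC_URL_PATH", "MYSQL_DATABASE_USER"],
    "properties": {
        "STATIC_URL_PATH": {"type": "string"},
        "MYSQL_DATABASE_USER": {"type": "string"},
        "SHOW_SETTINGS_ROUTE": {"type": "string", "default": "/settings"},
        "DEBUG": {"type": "boolean"},
    },
}

# JSON Schema type names mapped to Chef property types (partial on purpose).
CHEF_TYPES = {"string": "String", "boolean": "[true, false]"}


def to_chef_properties(schema):
    """Emit the header and property lines of a Chef custom resource."""
    required = set(schema.get("required", []))
    lines = [
        f"resource_name :{schema['title']}",
        "property :path, String, name_property: true",
    ]
    for name, rules in schema["properties"].items():
        parts = [f"property :{name.lower()}", CHEF_TYPES[rules["type"]]]
        if name in required:
            parts.append("required: true")
        if "default" in rules:
            # Quoting like this is only correct for string defaults.
            parts.append(f"default: '{rules['default']}'")
        lines.append(", ".join(parts))
    return "\n".join(lines)


print(to_chef_properties(SCHEMA))
```

The action block that writes the file is left out here; it would be generated from the same property list.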


  74. Generate Libral provider from schema
    $ ./to_libral.py
    #! /usr/bin/python
    import json
    import sys
    import os
    METADATA="""
    ---
    provider:
    type: app_config
    invoke: json
    actions: [get,set]
    suitable: true
    attributes:
    path:
    desc: The filepath for the configuration file


  75. Generate Puppet type from schema
    $ ./to_puppet.py
    Puppet::Type.newtype(:app_config) do
    ensurable
    validate do
    required_properties = [
    :static_url_path,
    :mysql_database_user,
    ]
    required_properties.each do |property|
    if self[property].nil? and self.provider.send(property) == :absent
    fail "You must provide a #{property}"
    end
    end
    end
    newparam(:path, namevar: true) do


  76. Conclusions
    If all you remember is...


  77. Schemas can allow for greater
    portability, and improved
    interoperability, between tools


  78. If you’re building applications,
    consider writing a schema to
    describe your configuration


  79. If you’re building configuration
    management tools, consider relying a
    lot more on auto-generation


  80. Any questions?
    And thanks for listening
