Slide 1

Slide 1 text

Why your configuration needs a schema
Gareth Rushgrove

Slide 2

Slide 2 text

@garethr

Slide 3

Slide 3 text

No content

Slide 4

Slide 4 text

- The proliferation of config file formats
- The high cost of config management
- Why configuration needs a schema
- Auto-generate everything

Slide 5

Slide 5 text

The proliferation of configuration file formats
The state of things

Slide 6

Slide 6 text

XML, INI, JSON, YAML, EDN, HOCON, TOML, CSON, Java Properties, internal DSLs, ...

Slide 7

Slide 7 text

Everyone has opinions about config file formats

Slide 8

Slide 8 text

Don’t use JSON as a Config File Format

Slide 9

Slide 9 text

No content

Slide 10

Slide 10 text

No content

Slide 11

Slide 11 text

Some formats are associated with certain languages or frameworks

Slide 12

Slide 12 text

Others become the default for communities of practice

Slide 13

Slide 13 text

Apart from the parsing bit, to the application being configured it’s just a data structure
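
A minimal sketch of that point, assuming an application that reads the same two settings from either a JSON file or an INI file. The file contents, the `app` section name and the `load_json` / `load_ini` helpers are illustrative and not taken from the talk.

```python
import configparser
import json

JSON_TEXT = """
{
    "STATIC_URL_PATH": "/my_project/static",
    "MYSQL_DATABASE_USER": "project_user_name"
}
"""

INI_TEXT = """
[app]
STATIC_URL_PATH = /my_project/static
MYSQL_DATABASE_USER = project_user_name
"""


def load_json(text):
    """Parse JSON text into a plain dict."""
    return json.loads(text)


def load_ini(text, section="app"):
    """Parse INI text and return one section as a plain dict."""
    parser = configparser.ConfigParser()
    parser.optionxform = str  # keep keys case-sensitive (the default lowercases them)
    parser.read_string(text)
    return dict(parser[section])


# After parsing, the application cannot tell which format the operator used.
config_from_json = load_json(JSON_TEXT)
config_from_ini = load_ini(INI_TEXT)
same_structure = config_from_json == config_from_ini
```

Only the two loader functions know about the file format; everything downstream works with the same dictionary.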

Slide 14

Slide 14 text

For the operator all of the different formats are separate user interfaces

Slide 15

Slide 15 text

This is one of the reasons for higher-level configuration management tools

Slide 16

Slide 16 text

The cost of configuration management
A barrier to entry to good tooling

Slide 17

Slide 17 text

Having multiple configuration management tools is not a bad thing, but it does mean a lot of reinventing the wheel

Slide 18

Slide 18 text

Everyone ends up with a way to manage packages, services, files, users and groups

Slide 19

Slide 19 text

Worse, everyone ends up with a way to manage Apache

Slide 20

Slide 20 text

sous-chef/apache2: 1498 commits, 116 contributors, 46 releases
puppetlabs/puppetlabs-apache: 2992 commits, 342 contributors, 39 releases
Ansible Galaxy: 298 results for apache

Slide 21

Slide 21 text

Managing files is a big part of managing most systems

Slide 22

Slide 22 text

Using BigQuery on 7.5 million lines of Puppet

Slide 23

Slide 23 text

What types are used the most?

Slide 24

Slide 24 text

More than 30% of Puppet resources were files

Slide 25

Slide 25 text

Option 1: Templates
None of the benefits of your chosen tool, and you’re exposed to all the configuration file formats directly

Slide 26

Slide 26 text

    template '/etc/app/config.yaml' do
      source 'config.yaml.erb'
      mode '0755'
      owner 'web'
      group 'web'
    end

You need a separate templating language

Slide 27

Slide 27 text

How can you reason about a system when the configuration is spread across an explosion of templating languages, file formats and templates?

Slide 28

Slide 28 text

Option 2: Format-specific resources
You can now use your chosen tool, but the tool has no context for the application. It’s just data, and the format still bleeds through

Slide 29

Slide 29 text

    ini_setting { "sample setting":
      ensure  => present,
      path    => '/tmp/foo.ini',
      section => 'bar',
      setting => 'baz',
      value   => 'quux',
    }

Manage an INI file with Puppet

Slide 30

Slide 30 text

A PowerShell DSC Resource for INI files

Slide 31

Slide 31 text

    Import-DscResource -ModuleName DSCR_IniFile
    cIniFile Apple
    {
        Path    = "C:\Test.ini"
        Section = ""
        Key     = "Fruit_A"
        Value   = "Apple"
    }

Manage an INI file with DSC

Slide 32

Slide 32 text

Ansible module for INI files

Slide 33

Slide 33 text

Chef resources for JSON and YAML files

Slide 34

Slide 34 text

Option 3: App-specific resources
You get all the power of your chosen tool, but at the cost of bespoke development

Slide 35

Slide 35 text

    webapp "cfgmgmtcamp" do
      static_url_path "/my_project/static"
      mysql_database_user "project_user_name"
      show_settings_route "/show-settings"
      debug true
    end

A bespoke application in Chef

Slide 36

Slide 36 text

How can we lower the cost of native resources for configuration?

Slide 37

Slide 37 text

What have schemas got to do with this?
Moving on to talk about solutions

Slide 38

Slide 38 text

Most Chef cookbooks or Ansible or Puppet modules are not written by the developers of the application being managed

Slide 39

Slide 39 text

Most configuration is informally specified via implementation, and often not versioned like an API

Slide 40

Slide 40 text

What if instead applications provided a schema for their configuration?

Slide 41

Slide 41 text

Examples and demonstrations
Experiments in auto-generating tools

Slide 42

Slide 42 text

Kubernetes has a well-defined set of configuration primitives: Pods, Deployments, Services, ReplicationControllers, etc.

Slide 43

Slide 43 text

Kubernetes uses OpenAPI to describe the API

Slide 44

Slide 44 text

OpenAPI uses JSON Schema internally
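
A rough sketch of what that means in practice, assuming an OpenAPI 2.0 (Swagger) document whose top-level `definitions` object holds one JSON Schema-style definition per type. The tiny `OPENAPI_DOC` below is made up for illustration and is far smaller than the real Kubernetes `swagger.json`; `extract_schemas` is a hypothetical helper, not the tool used in the talk.

```python
import json

# A made-up, minimal OpenAPI 2.0 document with a single definition.
OPENAPI_DOC = {
    "swagger": "2.0",
    "definitions": {
        "example.v1.Widget": {
            "type": "object",
            "required": ["name"],
            "properties": {
                "name": {"type": "string"},
                "replicas": {"type": "integer"},
            },
        },
    },
}


def extract_schemas(document):
    """Return a dict mapping each definition name to a standalone schema.

    OpenAPI 2.0 schema objects are based on a subset of JSON Schema draft 4,
    so each definition is copied and labelled with the draft-04 URI.
    Note: "$ref" pointers between definitions are NOT resolved here.
    """
    schemas = {}
    for name, definition in document.get("definitions", {}).items():
        schema = {"$schema": "http://json-schema.org/draft-04/schema#"}
        schema.update(definition)
        schemas[name] = schema
    return schemas


schemas = extract_schemas(OPENAPI_DOC)
for name, schema in schemas.items():
    print(name)
    print(json.dumps(schema, indent=2))
```

A real conversion would also need to resolve or rewrite references between definitions, which this sketch leaves out.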

Slide 45

Slide 45 text

Kubernetes JSON Schema

Slide 46

Slide 46 text

That’s a lot of JSON

    PS> (Get-Content -Path swagger.json | Measure-Object -Line).Lines
    85340
    PS> (Get-ChildItem -Path v*/*.json -Recurse | Measure-Object).Count
    26181
    PS> (Get-ChildItem -Path v*/*.json -Recurse | Get-Content | Measure-Object -Line).Lines
    7296392

Slide 47

Slide 47 text

Generated Puppet types and providers

Slide 48

Slide 48 text

Generated jsonnet templates

Slide 49

Slide 49 text

Generated ICL templates

Slide 50

Slide 50 text

Validation tools

Slide 51

Slide 51 text

Programming language clients

Slide 52

Slide 52 text

Could we have this for any application?

Slide 53

Slide 53 text

A simple example application from the internet

Slide 54

Slide 54 text

    {
      "STATIC_URL_PATH": "/my_project/static",
      "MYSQL_DATABASE_USER": "project_user_name",
      "SHOW_SETTINGS_ROUTE": "/show-settings",
      "DEBUG": true
    }

Our application has a configuration file

Slide 55

Slide 55 text

    {
      "definitions": {},
      "$schema": "http://json-schema.org/draft-06/schema#",
      "id": "app_config",
      "title": "app_config",
      "type": "object",
      "additionalProperties": false,
      "required": [
        "STATIC_URL_PATH",
        "MYSQL_DATABASE_USER"
      ],
      "properties": {
        "STATIC_URL_PATH": {
          "$id": "/properties/STATIC_URL_PATH",
          "type": "string",
          "title": "Static URL path",
          "description": "A filesystem path for static assets",

Let’s write a (JSON) schema

Slide 56

Slide 56 text

    "additionalProperties": false,

Only allow the defined properties

Slide 57

Slide 57 text

    "required": [
      "STATIC_URL_PATH",
      "MYSQL_DATABASE_USER"
    ],

These properties are required

Slide 58

Slide 58 text

    "STATIC_URL_PATH": {
      "$id": "/properties/STATIC_URL_PATH",
      "type": "string",
      "title": "Static URL path",
      "description": "A filesystem path for static assets",
      "examples": [
        "/my_project/static"
      ]
    },

Describe each individual property

Slide 59

Slide 59 text

Validate config using the schemas

    $ jsonschema -F "{error.message}" -i app.json schema.json
    u'STATIC_URL_PATH' is a required property

Slide 60

Slide 60 text

We have a schema. Now what?

Slide 61

Slide 61 text

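Validate arbitrary structures with JSON Schema

    import json

    import fastjsonschema

    data = {
        "STATIC_URL_PATH": "/my_project/static",
        "MYSQL_DATABASE_USER": "project_user_name",
        "SHOW_SETTINGS_ROUTE": "/show-settings",
        "DEBUG": True,
    }

    # Compile the schema once, then reuse the returned validator function.
    with open('schema.json') as schema_file:
        validate = fastjsonschema.compile(json.load(schema_file))

    # Raises fastjsonschema.JsonSchemaException if the data does not conform.
    validate(data)
    print(json.dumps(data))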

Slide 62

Slide 62 text

The JSON in JSON Schema refers to the syntax for the schema. It can be used to validate data in other formats
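
As a sketch of that idea, the same rules can be applied to settings that arrive as INI rather than JSON. This assumes a hand-rolled checker covering only `required`, `additionalProperties` and `type`; it is not a full JSON Schema implementation, and the `check` and `load_ini` helpers, the `app` section name and the trimmed-down `SCHEMA` are hypothetical.

```python
import configparser

# A trimmed-down version of the application schema from the earlier slides.
SCHEMA = {
    "type": "object",
    "additionalProperties": False,
    "required": ["STATIC_URL_PATH", "MYSQL_DATABASE_USER"],
    "properties": {
        "STATIC_URL_PATH": {"type": "string"},
        "MYSQL_DATABASE_USER": {"type": "string"},
        "SHOW_SETTINGS_ROUTE": {"type": "string"},
        "DEBUG": {"type": "boolean"},
    },
}

# INI input that is deliberately missing STATIC_URL_PATH.
INI_TEXT = """
[app]
MYSQL_DATABASE_USER = project_user_name
DEBUG = true
"""

PYTHON_TYPES = {"string": str, "boolean": bool}


def load_ini(text, schema, section="app"):
    """Parse INI text, converting values the schema declares as boolean."""
    parser = configparser.ConfigParser()
    parser.optionxform = str  # keep keys case-sensitive
    parser.read_string(text)
    values = parser[section]
    data = {}
    for key in values:
        declared = schema["properties"].get(key, {}).get("type")
        data[key] = values.getboolean(key) if declared == "boolean" else values[key]
    return data


def check(data, schema):
    """Return a list of error messages (empty when the data conforms)."""
    errors = []
    for name in schema.get("required", []):
        if name not in data:
            errors.append(f"'{name}' is a required property")
    for name, value in data.items():
        spec = schema["properties"].get(name)
        if spec is None:
            if schema.get("additionalProperties") is False:
                errors.append(f"'{name}' is not an allowed property")
            continue
        if not isinstance(value, PYTHON_TYPES[spec["type"]]):
            errors.append(f"'{name}' should be of type {spec['type']}")
    return errors


errors = check(load_ini(INI_TEXT, SCHEMA), SCHEMA)
for message in errors:
    print(message)
```

INI has no native booleans, so the loader uses the schema's declared types to decide how to convert each value before checking it.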

Slide 63

Slide 63 text

Generate browser-based user interfaces

Slide 64

Slide 64 text

Generate interactive documentation
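
A minimal sketch of the underlying idea, assuming a schema shaped like the one shown earlier. It renders a plain-text reference rather than anything interactive, and `render_docs` is a hypothetical helper, not one of the tools shown on the slide.

```python
# A shortened copy of the application schema, keeping the documentation fields.
SCHEMA = {
    "title": "app_config",
    "required": ["STATIC_URL_PATH", "MYSQL_DATABASE_USER"],
    "properties": {
        "STATIC_URL_PATH": {
            "type": "string",
            "title": "Static URL path",
            "description": "A filesystem path for static assets",
            "examples": ["/my_project/static"],
        },
        "DEBUG": {"type": "boolean"},
    },
}


def render_docs(schema):
    """Render a plain-text reference page from a schema's properties."""
    required = set(schema.get("required", []))
    lines = [f"Configuration reference: {schema['title']}", ""]
    for name, spec in schema["properties"].items():
        status = "required" if name in required else "optional"
        lines.append(f"{name} ({spec['type']}, {status})")
        if "description" in spec:
            lines.append(f"  {spec['description']}")
        for example in spec.get("examples", []):
            lines.append(f"  Example: {example}")
    return "\n".join(lines)


docs = render_docs(SCHEMA)
print(docs)
```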

Slide 65

Slide 65 text

Generate models in different languages

Slide 66

Slide 66 text

Quicktype generating Simple Types

    $ docker run -v ${PWD}:/pwd quicktype -l types -s schema /pwd/schemas/schema.json
    class Schema {
      staticURLPath: String
      mysqlDatabaseUser: String
      showSettingsRoute: Maybe
      debug: Maybe
    }

Slide 67

Slide 67 text

Quicktype generating Go

    $ docker run -v ${PWD}:/pwd quicktype -l go -s schema /pwd/schemas/schema.json
    // To parse and unparse this JSON data, add this code to your project and do:
    //
    //    r, err := UnmarshalSchema(bytes)
    //    bytes, err = r.Marshal()

    package main

    import "encoding/json"

    func UnmarshalSchema(data []byte) (Schema, error) {
        var r Schema
        err := json.Unmarshal(data, &r)
        return r, err
    }

    func (r *Schema) Marshal() ([]byte, error) {

Slide 68

Slide 68 text

Quicktype currently supports generating TypeScript, Elm, Java, C#, Go, Swift and C++

Slide 69

Slide 69 text

Python JSON Schema Objects

Slide 70

Slide 70 text

Dynamically build objects from schemas

    import json

    import python_jsonschema_objects as pjs

    with open('schema.json') as schema_file:
        schema = json.load(schema_file)

    # Build Python classes from the schema definitions.
    builder = pjs.ObjectBuilder(schema)
    ns = builder.build_classes()
    Config = ns.AppConfig

    config = Config(
        STATIC_URL_PATH="/static",
        MYSQL_DATABASE_USER="db",
    )

Slide 71

Slide 71 text

What if we could generate Puppet types, Chef resources, Libral providers, Ansible modules, etc.?

Slide 72

Slide 72 text

Live Demo Klaxon

Slide 73

Slide 73 text

Generate Chef resource from schema

    $ ./to_chef.py
    resource_name :app_config

    property :path, String, name_property: true
    property :static_url_path, String, required: true
    property :mysql_database_user, String, required: true
    property :show_settings_route, String, default: '/settings'
    property :debug, Boolean

    action :create do
      file path do
        content "{
          STATIC_URL_PATH: "#{static_url_path}",
          MYSQL_DATABASE_USER: "#{mysql_database_user}",
          SHOW_SETTINGS_ROUTE: "#{show_settings_route}",
          DEBUG: #{debug}
        }"
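
The source of `to_chef.py` is not shown in the slides. The following is a minimal sketch of how such a generator might work, assuming a schema like the one shown earlier. The `TYPE_MAP` table, the `chef_properties` helper, the `default` value on `SHOW_SETTINGS_ROUTE`, and the mapping of JSON Schema `boolean` to Chef's `[true, false]` are illustrative choices rather than the author's implementation.

```python
# Assumed schema; the "default" on SHOW_SETTINGS_ROUTE is a guess made so the
# output includes a default, since the slide's schema is truncated.
SCHEMA = {
    "required": ["STATIC_URL_PATH", "MYSQL_DATABASE_USER"],
    "properties": {
        "STATIC_URL_PATH": {"type": "string"},
        "MYSQL_DATABASE_USER": {"type": "string"},
        "SHOW_SETTINGS_ROUTE": {"type": "string", "default": "/settings"},
        "DEBUG": {"type": "boolean"},
    },
}

# Hypothetical mapping from JSON Schema types to Chef property types.
TYPE_MAP = {"string": "String", "boolean": "[true, false]"}


def chef_properties(schema):
    """Return one Chef `property` declaration per schema property."""
    required = set(schema.get("required", []))
    lines = []
    for name, spec in schema["properties"].items():
        parts = [f"property :{name.lower()}", TYPE_MAP[spec["type"]]]
        if name in required:
            parts.append("required: true")
        if "default" in spec:
            # Quoting like this is only correct for string defaults.
            parts.append(f"default: '{spec['default']}'")
        lines.append(", ".join(parts))
    return lines


generated = "\n".join(chef_properties(SCHEMA))
print(generated)
```

The same loop over `properties` could feed a different template to emit a Puppet type or an Ansible module, which is the appeal of starting from a schema.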

Slide 74

Slide 74 text

Generate Libral provider from schema

    $ ./to_libral.py
    #! /usr/bin/python
    import json
    import sys
    import os

    METADATA="""
    ---
    provider:
      type: app_config
      invoke: json
      actions: [get,set]
      suitable: true
      attributes:
        path:
          desc: The filepath for the configuration file

Slide 75

Slide 75 text

Generate Puppet type from schema

    $ ./to_puppet.py
    Puppet::Type.newtype(:app_config) do
      ensurable

      validate do
        required_properties = [
          :static_url_path,
          :mysql_database_user,
        ]
        required_properties.each do |property|
          if self[property].nil? and self.provider.send(property) == :absent
            fail "You must provide a #{property}"
          end
        end
      end

      newparam(:path, namevar: true) do

Slide 76

Slide 76 text

Conclusions
If all you remember is...

Slide 77

Slide 77 text

Schemas can allow for greater portability, and improved interoperability, between tools

Slide 78

Slide 78 text

If you’re building applications, consider writing a schema to describe your configuration

Slide 79

Slide 79 text

If you’re building configuration management tools, consider relying a lot more on auto-generation

Slide 80

Slide 80 text

Any questions?
And thanks for listening