Slide 1

Slide 1 text

The classification problem: challenges and solutions
Marco Marongiu (@brontolinux)

Slide 2

Slide 2 text

Acknowledgements
All the images and logos used in this presentation are shown for didactic purposes only, in the belief that this constitutes fair use of these resources. I don't claim any rights to the images and products shown in this presentation. If you are the owner of any material used in this presentation and you believe that I am misusing it, please let me know and I'll promptly remove it from all my sources.

Slide 3

Slide 3 text

Agenda
➔ Why classification is critical
➔ Why classification is difficult
➔ How we can make it sane
➔ Final take-aways

Slide 4

Slide 4 text

Part 1: the problem

Slide 5

Slide 5 text

!!! !!! !!! !!! !!! !!! !!!

Slide 6

Slide 6 text

Configuration management
«Configuration Management is the process of identifying and defining the items in the system, controlling the change of these items throughout their life-cycle, recording and reporting the status of items and change requests, and verifying the completeness and correctness of items»
-- IEEE Glossary of Software Engineering Terminology (Standard 729-1983)

Slide 8

Slide 8 text

In configuration management you don't apply a configuration to a single system; rather, you apply it to classes of systems!

Slide 9

Slide 9 text

Part 2: challenges

Slide 10

Slide 10 text

Exceptions are the rule

Slide 11

Slide 11 text

Generic settings, the same for all systems
Locations: Amsterdam, Oslo, Seattle, Iceland
DNS server, NTP server, Syslog server, SSH configs
Other random exceptions

Slide 12

Slide 12 text

Internal classification doesn't scale
definitions explosion
difficult reporting
human bottleneck
«The business should define what systems belong in which classes. The Cfengine administrator should build policy. Once the Cfengine administrator is left to manually defining classes within policy, you become a bottleneck» -- M.Svoboda, LinkedIn

Slide 13

Slide 13 text

Part 3: Solutions

Slide 14

Slide 14 text

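Axes: infrastructure complexity (low complexity to high complexity) and technical skills (well versed to non technical)
Low complexity, well versed: Basic interface & basic backend
High complexity, well versed: Basic interface & sophisticated backend
High complexity, non technical: Sophisticated interface & sophisticated backend
Low complexity, non technical: Sophisticated interface & basic backend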

Slide 15

Slide 15 text

The LinkedIn solution
M.Svoboda - LinkedIn
"Leveraging In-Memory Key Value Stores for Large-Scale Operations"

Slide 16

Slide 16 text

our implementation: hENC
power & simplicity
config info in plain text files
module protocol
a simple Perl script

Slide 17

Slide 17 text

No content

Slide 18

Slide 18 text

CFEngine's module protocol
+activated_class
-cancelled_class
=my_var=my value
@my_list={ 'list','of','4','values'}
=my_array[element]=value
@my_array[list]={ 'list','of','4','values'}
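To make the line formats above concrete, here is a minimal Python sketch (not part of the talk, which uses shell and Perl) that classifies a single module-protocol line. The function name `parse_module_line` and the layout of the returned tuple are illustrative assumptions; only the leading-character conventions come from the slide.

```python
import re

# Leading character -> meaning, as listed on the slide.
KINDS = {
    "+": "activate_class",
    "-": "cancel_class",
    "=": "scalar",
    "@": "list",
}

def parse_module_line(line):
    """Return (kind, name, raw_value), or None if the line is not module protocol."""
    match = re.match(r"^\s*([=@+-])(.+?)\s*$", line)
    if match is None:
        return None
    marker, rest = match.groups()
    kind = KINDS[marker]
    if marker in "+-":
        # Class lines carry only a class name.
        return (kind, rest, None)
    # Variable lines look like name=value; split on the first '=' only.
    name, sep, value = rest.partition("=")
    if not sep:
        return None
    return (kind, name, value)
```

The value is returned as raw text on purpose: interpreting list syntax is left to CFEngine itself.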

Slide 19

Slide 19 text

#!/bin/sh
# Print only the lines that follow the module protocol, from all files given as arguments
/bin/egrep -h '^[=@+-]' "$@" 2> /dev/null

Slide 20

Slide 20 text

#!/usr/bin/perl

use strict ;
use warnings ;

my %class ;    # classes container
my %variable ; # variables container

# Silence errors (e.g.: missing files)
close STDERR ;

while (my $line = <>) {
    chomp $line ;
    my ($setting,$id) = ( $line =~ m{^\s*([=\@+-])(.+)\s*$} ) ;

    # line didn't match the module protocol
    next if not defined $setting ;

    # add a class
    if ($setting eq '+') {
        # $id is a class name, or should be.
        $class{$id} = 1 ;
    }

    # undefine a class
    if ($setting eq '-') {
        # $id is a class name, or should be.
        $class{$id} = -1 ;
    }

    # define a variable/list
    if ($setting eq '=' or $setting eq '@') {
        # $id is "variable = something", or should be
        my ($varname) = ( $id =~ m{^(.+?)=} ) ;
        $variable{$varname} = $line ;
    }

    # discard the rest
}

# print out classes
foreach my $classname (keys %class) {
    print "+$classname\n" if $class{$classname} > 0 ;
    print "-$classname\n" if $class{$classname} < 0 ;
}

# print variable/list assignments, the last one wins
foreach my $assignment (values %variable) {
    print "$assignment\n" ;
}

Slide 21

Slide 21 text

read general defaults
read location defaults
read environment defaults
read node settings
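The read order above matters because later files override earlier ones. Below is a minimal Python sketch of that layering, assuming each layer is already available as a list of module-protocol lines. The function name `merge_layers` and all sample settings are hypothetical; the talk's own implementation is a Perl script, not this code.

```python
def merge_layers(layers):
    """Merge module-protocol lines from several layers; later layers win."""
    classes = {}    # class name -> True (activated) or False (cancelled)
    variables = {}  # variable name -> full assignment line
    for lines in layers:
        for raw in lines:
            line = raw.strip()
            if not line or line[0] not in "=@+-":
                continue  # not a module-protocol line
            marker, rest = line[0], line[1:]
            if marker in "+-":
                classes[rest] = (marker == "+")
            elif "=" in rest:
                name = rest.split("=", 1)[0]
                variables[name] = line
    # Sorted output is a choice made here for predictability.
    out = [("+" if on else "-") + name for name, on in sorted(classes.items())]
    out += [variables[name] for name in sorted(variables)]
    return out

# Hypothetical sample data, one list per layer, in read order.
general = ["=ntp_server=ntp.example.com", "+use_syslog"]
location = ["=ntp_server=ntp.oslo.example.com"]
environment = []
node = ["-use_syslog", "=motd_file=/etc/motd.special"]

merged = merge_layers([general, location, environment, node])
```

In this example the location layer replaces the general NTP server, and the node layer cancels a class that the general defaults activated.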

Slide 22

Slide 22 text

# environment-dependent settings
oslo_public::
  "enc_subdir"
    policy => "overridable",
    string => "$(enc_basedir)/pub" ;

on_private_net_only::
  "enc_subdir"
    policy => "overridable",
    string => "$(enc_basedir)/priv" ;

oslo::
  "henclist"
    policy => "overridable",
    slist => {
      # general defaults
      "$(enc_basedir)/_default_",
      # location defaults
      "$(enc_basedir)/_oslo_",
      # environ. defaults
      "$(enc_subdir)/_oslo_",
      "$(enc_subdir)/$(sys.domain)/$(sys.fqhost)",
    } ;
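The list above runs from the most generic file to the most specific one. As an illustration only, the same path construction could be written in Python as below. The function name `henc_file_list`, its parameters, and the sample values are hypothetical; in the talk this logic lives in CFEngine policy.

```python
def henc_file_list(basedir, subdir, location, domain, fqhost):
    """Return ENC files from most generic to most specific."""
    return [
        f"{basedir}/_default_",          # general defaults
        f"{basedir}/_{location}_",       # location defaults
        f"{subdir}/_{location}_",        # environment defaults
        f"{subdir}/{domain}/{fqhost}",   # node settings
    ]

basedir = "/var/cfengine/enc"   # assumed value, not given in the slides
files = henc_file_list(basedir, f"{basedir}/pub", "oslo",
                       "example.com", "web01.example.com")
```

Files that do not exist are simply skipped by the merge script, since it silences read errors, so only the layers that actually need a setting have to provide a file.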

Slide 23

Slide 23 text

methods:
  "ENC"
    comment => "External node classification",
    usebundle => henc("site.henclist") ;

vars:
  "motd_file"
    string => "$(henc.motd_file)",
    policy => "overridable" ;

Slide 24

Slide 24 text

final take-aways
smart classification is crucial...
...but doesn't need to be complicated...
...unless your infrastructure is.
sometimes, even plain text is good enough!

Slide 25

Slide 25 text

No content

Slide 26

Slide 26 text

comments?
@brontolinux
http://syslog.me
http://no.linkedin.com/in/marcomarongiu/

Slide 27

Slide 27 text

THANK you