The classification problem: challenges and solutions (ed. 2015)

Participants will learn:
- Why classification is critical
- Why classification is difficult
- How we can make it sane
All the participants will learn the power, traps and pitfalls of classification, regardless of the configuration management tool they use.
The simple external node classifier (ENC) solution that we implemented at Opera will be fully described. CFEngine users will be able to reuse it in their own environments, while users of other tools will be able to rethink their ENC solutions from a fresh perspective.

Marco Marongiu

February 12, 2015

Transcript

  1. The classification problem: challenges and solutions Marco Marongiu (@brontolinux)

  2. Agenda ➔ Why classification is critical ➔ Why classification is difficult ➔ How we can make it sane ➔ Final take-aways
  3. Part 1: the problem

  4. !!! !!! !!! !!! !!!

  5. Configuration management «Configuration Management is the process of identifying and defining the items in the system, controlling the change of these items throughout their life-cycle, recording and reporting the status of items and change requests, and verifying the completeness and correctness of items» -- IEEE Glossary of Software Engineering Terminology (Standard 729-1983)
  7. In configuration management you don't apply a configuration to a system; rather, you apply it to classes of systems!
  8. Part 2: challenges

  9. Exceptions are the rule

  10. [Diagram labels] Generic settings, the same for all systems; Amsterdam; DNS server; NTP server; Syslog server; SSH configs; Oslo; Other random exceptions; Seattle; Iceland
  11. Internal classification doesn't scale: definitions explosion, difficult reporting, human bottleneck

    «The business should define what systems belong in which classes. The Cfengine administrator should build policy. Once the Cfengine administrator is left to manually defining classes within policy, you become a bottleneck» -- M.Svoboda, LinkedIn
  12. Definitions explosion class1 exception1 exception2

  13. Definitions explosion: class1, exception1, exception2, exception3

    c1&!(ex1|ex2|ex3)
    ex1&!(ex2|ex3)
    ex2&!(ex1|ex3)
    ex3&!(ex1|ex2)
    (ex1&ex2)&!ex3
    (ex2&ex3)&!ex1
    (ex1&ex3)&!ex2
    ex1&ex2&ex3
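The growth shown on this slide can be reproduced with a minimal sketch: every subset of the exceptions needs its own mutually exclusive class expression, so n exceptions yield 2^n definitions. The function name `exclusive_cases` and the exact bracket style of the generated expressions are assumptions made for illustration; they are not taken from the talk.

```python
from itertools import combinations


def exclusive_cases(base, exceptions):
    """Return one mutually exclusive class expression per subset of exceptions."""
    cases = []
    for size in range(len(exceptions) + 1):
        for active in combinations(exceptions, size):
            inactive = [e for e in exceptions if e not in active]
            # With no exception active, the plain base class applies.
            positive = "&".join(active) if active else base
            # Every exception that is not active must be explicitly excluded.
            negative = f"&!({'|'.join(inactive)})" if inactive else ""
            cases.append(positive + negative)
    return cases


if __name__ == "__main__":
    for expression in exclusive_cases("c1", ["ex1", "ex2", "ex3"]):
        print(expression)
```

Three exceptions give eight expressions, four give sixteen, and each extra exception doubles the count again.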
  14. Reporting. Not a problem caused by bad classification alone: it's actually an indication that you lack a good inventory and compliance reporting system. But if it applies to you... If all the classes are defined in configuration files, and the class definitions are not trivial... ...how easy is it to tell at a glance if a host belongs to a certain class?
  15. Reporting
    ✔ take an infrastructure of 30000 nodes (LinkedIn-size!)
    ✔ try to report on something using SSH in a loop
    ✔ if no problem happens, and it takes ~3s per server on average, it takes 90000 seconds (>24h)
    ✔ if you do it with 10 parallel processes, it's still 2.5h
    ✗ ...and you haven't started analysing the data yet
    ✗ ...and you haven't made it available to management yet
    ✗ ...and you need an up-to-date inventory of all nodes to provide accurate results
    ✗ ...and since not everybody has SSH access to the nodes or to the inventory, you are the only one (and the bottleneck) who can make that report!
    ✗ ...and it's an ideal case!
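The arithmetic on this slide can be checked with a few lines. This is a minimal sketch of the ideal case only (no failures, perfectly even parallelism); the function name `report_duration_hours` is an assumption for illustration.

```python
def report_duration_hours(nodes, seconds_per_node, parallel=1):
    """Wall-clock hours needed to poll every node once, in the ideal case."""
    total_seconds = nodes * seconds_per_node / parallel
    return total_seconds / 3600


if __name__ == "__main__":
    print(report_duration_hours(30000, 3))               # sequential SSH loop
    print(report_duration_hours(30000, 3, parallel=10))  # 10 parallel processes
```

The sequential loop comes out at 25 hours and the 10-way parallel run at 2.5 hours, matching the figures quoted on the slide.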
  16. Your task is building policies, not defining classes

  17. «What if you had the ability to write a policy and use a class within that policy -- but the business could control which machines were associated with that class external to Cfengine? [...] The business can deploy an application to a machine, and Cfengine automatically responds in its policy because new classes have been set on the client automatically. Other business units can create classes of machines, and file a ticket for you to implement XYZ against the class they created. You are allowing your business to work the way it wants to.» -- M.Svoboda, LinkedIn
  18. Part 3: Solutions

  19. [Diagram] Axes: infrastructure complexity (low complexity, high complexity) and technical skills (well versed, non technical). Quadrants: Basic interface & basic backend; Basic interface & sophisticated backend; Sophisticated interface & sophisticated backend; Sophisticated interface & basic backend
  20. Puppet and ENC: «An external node classifier is an arbitrary script or application which can tell Puppet which classes a node should have [...] that can be called by puppet master; it doesn’t have to be written in Ruby. Its only argument is the name of the node to be classified, and it returns a YAML document describing the node. [...] To tell puppet master to use an ENC, you need to set two configuration options: node_terminus has to be set to “exec”, and external_nodes should have the path to the executable.»
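The contract described in the quote (one argument, the node name; a YAML document on standard output) can be sketched in a few lines of Python. The node names, the `NODES` table, the `DEFAULT` fallback and the helper names are all hypothetical; only the `classes` and `environment` keys follow Puppet's ENC output format. The YAML is written by hand to keep the sketch free of third-party modules.

```python
#!/usr/bin/env python3
import sys

# Hypothetical node database; a real ENC would query an inventory or a CMDB.
NODES = {
    "web01.example.com": {"classes": ["ntp", "apache"], "environment": "production"},
}
# Hypothetical fallback for nodes that are not listed above.
DEFAULT = {"classes": ["ntp"], "environment": "production"}


def classify(node_name):
    """Look up the classification for a node, falling back to DEFAULT."""
    return NODES.get(node_name, DEFAULT)


def to_yaml(node):
    """Render the classification as a small YAML document."""
    lines = ["---", "classes:"]
    lines += [f"  - {name}" for name in node["classes"]]
    lines.append(f"environment: {node['environment']}")
    return "\n".join(lines) + "\n"


# The node name is the only argument, as the quote above describes.
if __name__ == "__main__" and len(sys.argv) == 2:
    sys.stdout.write(to_yaml(classify(sys.argv[1])))
```

Saved as an executable file, a script like this is what `external_nodes` would point to once `node_terminus` is set to `exec`.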
  21. ENC in CFEngine means... modules!

  22. Our implementation: hENC. Power & simplicity: config info in plain text files, module protocol, a simple Perl script
  23. None
  24. CFEngine's module protocol (up to version 3.5)

          +activated_class
          -cancelled_class
          =my_var=my value
          @my_list={ 'list','of','4','values'}
          =my_array[element]=value
          @my_array[list]={ 'list','of','4','values'}
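To make the line format concrete, here is a minimal Python sketch that classifies one line of module output according to the four markers on this slide. The function name `parse_module_line` and the tuple labels it returns are assumptions made for illustration; CFEngine itself does the real parsing.

```python
def parse_module_line(line):
    """Classify one line of module output; return None for anything else."""
    line = line.strip()
    if len(line) < 2:
        return None
    marker, rest = line[0], line[1:]
    if marker == "+":
        return ("define_class", rest, None)
    if marker == "-":
        return ("cancel_class", rest, None)
    if marker in "=@":
        # Split on the first '=' only: the value may contain '=' itself.
        name, separator, value = rest.partition("=")
        if not separator or not name:
            return None
        kind = "string" if marker == "=" else "list"
        return (kind, name, value)
    return None


if __name__ == "__main__":
    for sample in ("+activated_class", "=my_var=my value", "# ignored"):
        print(parse_module_line(sample))
```

Array elements need no special handling in this sketch: `my_array[element]` is simply treated as the variable name.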
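  25.

          #!/bin/sh
          # Print only the lines that follow the module protocol.
          # The pattern and the file list are quoted to protect them from the shell.
          /bin/egrep -h '^[=@+-]' "$@" 2> /dev/null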

  26.

          #!/usr/bin/perl

          use strict ;
          use warnings ;

          my %class ;    # classes container
          my %variable ; # variables container

          # Silence errors (e.g.: missing files)
          close STDERR ;

          while (my $line = <>) {
              chomp $line ;
              # '-' is listed last in the character class so that it is a
              # literal hyphen and not part of a range
              my ($setting,$id) = ( $line =~ m{^\s*([=\@/+_-])(.+)\s*$} ) ;
              next if not defined $setting ; # line didn't match the module protocol

              # add a class
              if ($setting eq '+') {
                  # $id is a class name, or should be.
                  $class{$id} = 1 ;
              }

              # undefine a class
              if ($setting eq '-') {
                  # $id is a class name, or should be.
                  $class{$id} = -1 ;
              }

              # reset the status of a class
              if ($setting eq '_') {
                  # $id is a class name, or should be.
                  delete $class{$id} if exists $class{$id} ;
              }
  27.

              # define a variable/list
              if ($setting eq '=' or $setting eq '@') {
                  # $id is "variable = something", or should be
                  my ($varname) = ( $id =~ m{^(.+?)=} ) ;
                  $variable{$varname} = $line ;
              }

              # reset a variable/list
              if ($setting eq '/') {
                  # $id is a variable name, or should be
                  delete $variable{$id} if exists $variable{$id} ;
              }

              # discard the rest
          }

          # print out classes
          foreach my $classname (keys %class) {
              print "+$classname\n" if $class{$classname} > 0 ;
              print "-$classname\n" if $class{$classname} < 0 ;
          }

          # print variable/list assignments, the last one wins
          foreach my $assignment (values %variable) {
              print "$assignment\n" ;
          }
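The merging rule of the Perl script above (later lines override earlier ones, `_` forgets a class, `/` forgets a variable) may be easier to follow in a worked example. The following is a rough Python port made for illustration, not the author's code. The function name `merge`, the sample layers and the host names are hypothetical, and the character class lists `-` last so that it is a literal hyphen, which is a choice of this sketch.

```python
import re

# One marker character followed by the rest of the line.
LINE = re.compile(r"^\s*([=@/+_-])(.+)\s*$")


def merge(lines):
    """Merge module-protocol lines; later lines override earlier ones."""
    classes = {}    # class name -> 1 (define) or -1 (cancel)
    variables = {}  # variable name -> full assignment line
    for line in lines:
        line = line.rstrip("\n")
        match = LINE.match(line)
        if not match:
            continue  # not a module-protocol line
        setting, ident = match.groups()
        if setting == "+":
            classes[ident] = 1
        elif setting == "-":
            classes[ident] = -1
        elif setting == "_":
            classes.pop(ident, None)  # reset: forget the class entirely
        elif setting in "=@":
            if "=" in ident:
                variables[ident.split("=", 1)[0]] = line
        elif setting == "/":
            variables.pop(ident, None)  # reset: forget the variable
    output = [("+" if state > 0 else "-") + name for name, state in classes.items()]
    output.extend(variables.values())
    return output


if __name__ == "__main__":
    general = ["+ntp_unicast", "=ntp_server=ntp.example.com", "+syslog_client"]
    location = ["=ntp_server=ntp.oslo.example.com", "-syslog_client"]
    print(merge(general + location))
```

Feeding the files from the most general to the most specific is what makes the node-level settings win. Unlike a Perl hash, a Python dict keeps insertion order, so the output order here is deterministic.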
  28. [Diagram of the file hierarchy; labels: general defaults, location, environment, project, overrides, node]
  29.

          bundle agent enc
          {
            vars:
              any::
                [...]

                ##################################################################
                # General defaults
                "defaults" string => "$(basedir)/_default_" ;

                ##################################################################
                # Select the location (class => file)
                "location_class[amsterdam]" string => "_amsterdam_" ;
                "location_class[ashburn]"   string => "_ashburn_" ;
                "location_class[oslo]"      string => "_oslo_" ;
                "location_class[seattle]"   string => "_seattle_" ;
                "location_class[thor]"      string => "_thor_" ;
                "location_class[wroclaw]"   string => "_wroclaw_" ;

                # @(locations) gets all class names mentioned in location_class
                "locations" slist => getindices("location_class") ;

                # loop through @(locations), get the right file for this location
                "location"
                  comment    => "Select a path for the location",
                  string     => "$(locdir)/$(location_class[$(locations)])",
                  ifvarclass => "$(locations)" ;
  30.

            classes:
              any::
                "enc_has_location"       expression => isvariable("location") ;
                "enc_has_locenvironment" expression => isvariable("locenvironment") ;
                "enc_has_project"        expression => isvariable("project") ;
                "enc_has_override"       expression => isvariable("override") ;
                "enc_has_projenv"        expression => isvariable("projenv") ;
                "enc_has_node"           expression => isvariable("node") ;
  31.

            vars:
                [. . .]

                ##################################################################
                # put it all together:

                # initialize the list:
                "list"
                  comment => "Build the ENC list",
                  policy  => "free",
                  slist   => { "$(defaults)" } ;

              enc_has_location::
                "list"
                  comment => "Increment ENC list with location",
                  policy  => "free",
                  slist   => { @(list), "$(location)" } ;

              enc_has_locenvironment::
                "list"
                  comment => "Increment ENC list with per-location environment",
                  policy  => "free",
                  slist   => { @(list), "$(locenvironment)" } ;

                [. . .]

              enc_has_node::
                "list"
                  comment => "Increment ENC list with node settings",
                  policy  => "free",
                  slist   => { @(list), "$(node)" } ;
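The list-building logic on this slide boils down to "start with the defaults, then append each more specific file only if it is defined". A minimal Python sketch of that idea follows. The function name `build_enc_list` and the example paths are hypothetical, and only layers that appear on the slide are used in the example, since the full order is elided there.

```python
def build_enc_list(defaults, layers):
    """Return the ordered file list: defaults first, then each defined layer.

    `layers` is a sequence of (name, path) pairs, ordered from the most
    general to the most specific; a path of None means "not defined here".
    """
    enc_list = [defaults]
    for _name, path in layers:
        if path is not None:
            enc_list.append(path)
    return enc_list


if __name__ == "__main__":
    layers = [
        ("location", "loc/_oslo_"),   # hypothetical path
        ("locenvironment", None),     # no per-location environment file
        ("node", "node/host1"),       # hypothetical path
    ]
    print(build_enc_list("_default_", layers))
```

Because the merger lets the last assignment win, the order of this list is what gives the node-level file the final say.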
  32.

            methods:
                "ENC"
                  comment   => "External node classification",
                  usebundle => henc("enc.list") ;

            vars:
              ntp_unicast::
                "servers" slist => { @(henc.ntp_servers) } ;
  33.

          bundle agent henc(enclist_name)
          {
            vars:
              henc_has_list::
                "enclist" slist => { "@($(enclist_name))" } ;
                "enc_fullpath" slist => maplist("$(site.inputs)/$(this)","enclist") ;
                "encargs" string => join(" ","enc_fullpath") ;

            classes:
                "henc_has_list" expression => isvariable("enclist_name") ;
                "henc_has_args" expression => isvariable("encargs") ;
                "henc_can_classify" and => { "henc_has_list","henc_has_args" } ;

            files:
              any::
                "$(site.lmodules)/henc"
                  comment   => "Copy/update hierarchical merger",
                  copy_from => digest_cp("$(site.modules)/henc"),
                  perms     => root_executable ;

              henc_has_list::
                "$(site.inputs)/$(enclist)"
                  comment   => "Cache henc files locally",
                  copy_from => digest_cp("$(site.masterfiles)/$(enclist)") ;

            commands:
              henc_can_classify.!henc_classes_activated::
                "$(site.lmodules)/henc"
                  comment => "Hierarchical classification for $(sys.fqhost)",
                  args    => "$(encargs)",
                  classes => always("henc_classes_activated"),
                  module  => "true" ;
          }
  34. Final take-aways: smart classification is crucial... ...but doesn't need to be complicated... ...unless your infrastructure is. Sometimes, even plain text is good enough!
  35. None
  36. comments? @brontolinux mmarongiu@tiscali.it http://syslog.me http://no.linkedin.com/in/marcomarongiu/

  37. Thanks for having me (TAKK for meg)