
Trying to scale Puppet with simple tricks


Configuration Management Infrastructure @GRNET. Trying to scale Puppet with simple tricks.

GRNOG 4th technical meeting

Alexandros Afentoulis

December 21, 2016



Transcript

  1. Configuration Management Infrastructure @GRNET, or: Trying to scale Puppet with
    simple tricks. Alexandros Afentoulis, [email protected], GRNOG 4th technical meeting
  2. GRNET NOC "servers" operations team:
    - ~okeanos cloud service
    - ViMa virtual machines service
    - Blood Donor Registry
    - Eudoxus, Apella, Harmoni (hospitals)
    - more
    Fleet of ~800 machines, ~400 bare metal; Debian GNU/Linux; 5.5 datacenters in 5 locations
  3. Overview of changes in the Puppet ecosystem to achieve:
    - availability
    - easier maintenance
    - locality awareness
  4. What is Puppet?
    - Infrastructure as code
    - An ecosystem (beast) on its own
    - Ensures a state of configuration
    - Essential tool for managing our services/infra
  5. How does Puppet work?
    - Describe desired resources & state in the Puppet DSL
    - Host asks to be "puppetized"
    - Puppetmaster compiles and serves a catalog
    - Host eventually converges to the desired state
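    The "describe desired resources & state" step can be sketched in the Puppet DSL. This is an illustrative manifest only, not taken from GRNET's codebase; the ntp package, file, and service names are assumptions for the example:

    ```puppet
    # Hypothetical manifest: keep an NTP daemon installed, configured,
    # and running; restart it whenever its config file changes.
    package { 'ntp':
      ensure => installed,
    }

    file { '/etc/ntp.conf':
      ensure  => file,
      source  => 'puppet:///modules/ntp/ntp.conf',
      require => Package['ntp'],
      notify  => Service['ntp'],
    }

    service { 'ntp':
      ensure => running,
      enable => true,
    }
    ```

    On each run the agent compares this desired state against the host and applies only the differences, which is what "eventually converges" means in practice.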
  6. Scaling challenges
    - How to scale as the number of hosts grows?
    - How to scale across multiple DCs?
    - How to minimize failure impact?
    - What about locality (DC-centric)?
  7. Our ecosystem, old setup
    - Hosts: server=puppet.grnet.gr
    - Multiple puppetmasters in different locations
    - puppet.grnet.gr: DNS round robin
    - It was working, but...
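    The old setup boils down to several A records behind one name; a hypothetical BIND-style zone fragment (the addresses are placeholder documentation IPs, not GRNET's):

    ```
    ; puppet.grnet.gr answered by several puppetmasters via DNS round robin;
    ; the resolver rotates the order of records between queries.
    puppet  IN  A  192.0.2.10
    puppet  IN  A  192.0.2.11
    puppet  IN  A  192.0.2.12
    ```

    Round robin spreads load but is blind to server health: a dead puppetmaster keeps receiving a share of clients until its record is removed, which motivates the next slides.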
  8. Can we do better?
    - A mechanism to smartly manage the pool
    - Health checks, add/remove on the fly, stats, more?
  9. A load balancer, please. Hello HAProxy!
    - 1 instance per location, TCP proxy
    - Puppetmasters pool as backend, local preference
    - Health checks (L7), admin socket => actions scriptable, smart auto-reactions, stats
    - Stable, lightweight
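    The "actions scriptable" bullet refers to HAProxy's runtime admin socket. A sketch of draining and re-enabling one puppetmaster with socat; the socket path is an assumption, and the backend/server names follow the config shown in this deck:

    ```
    # Drain one puppetmaster before maintenance, then bring it back.
    # /run/haproxy/admin.sock is an assumed path ("stats socket" in haproxy.cfg).
    echo "disable server puppetworkers/p1" | socat stdio /run/haproxy/admin.sock
    echo "enable server puppetworkers/p1"  | socat stdio /run/haproxy/admin.sock

    # Quick overview: CSV stats, keeping proxy name, server name, status.
    echo "show stat" | socat stdio /run/haproxy/admin.sock | cut -d, -f1,2,18
    ```

    The same socket exposes counters and queue depths, which is what makes automated reactions (scripts watching stats and toggling servers) practical.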
  10. peers mybeers
        peer eie-anyhap 127.0.0.1:1337

      frontend puppetfront
        bind :::8140 v4v6
        acl big_queue avg_queue(puppetworkers) gt 1
        use_backend moar_puppetworkers if big_queue
        acl icinga src 194.177.210.168 62.217.124.126
        acl bacula src 83.212.5.68
        acl jp src 194.177.211.163
        use_backend fast_puppetworkers if icinga or bacula or jp
        default_backend puppetworkers
        timeout client 1000s
        option tcplog

      backend puppetworkers
        timeout server 1000s
        mode tcp
        balance roundrobin
        stick-table type ip size 200k expire 15m peers mybeers
        stick on src
        option httpchk GET /production/status/koko HTTP/1.1\r\nHost:puppet.grnet.gr\r\nAccept:*/*
        option srvtcpka
        option allbackups
        server p0 p0.grnet.gr:8140 check check-ssl inter 10s fastinter 1s fall 1 weight 50 maxconn 12 backup
        server p1 p1.grnet.gr:8140 check check-ssl inter 2s fastinter 1s fall 1 weight 50 maxconn 12
        server p2 p2.grnet.gr:8140 check check-ssl inter 10s fastinter 1s fall 1 weight 50 maxconn 12 backup
        server p3 p3.grnet.gr:8140 check check-ssl inter 10s fastinter 1s fall 1 weight 50 maxconn 12 backup
        server p4 p4.grnet.gr:8140 check check-ssl inter 2s fastinter 1s fall 1 weight 50 maxconn 12
        server p5 p5.grnet.gr:8140 check check-ssl inter 2s fastinter 1s fall 1 weight 50 maxconn 12
  11. Are we done yet?
    - We now nicely manage the pool: availability for puppetmasters
    But...
    - How will hosts know about the HAProxy frontends?
    - Shouldn't those be available as well?
    - We want an active-active setup, no VRRP
    - We want locality too
  12. Anycast
    - Set up an anycast service IP
    - puppet.grnet.gr resolves to that
    - Each HAProxy instance peers with a GRNET IP router
    - Announcements via BGP (bird)
    - AS for private use, a /32

    router id 62.217.126.167;
    filter bgp_to_grnet {
      if net.ip = 62.217.126.165 && net.len = 32 then accept;
      else reject;
    }
    protocol bgp BGP_ANY {
      local as 65500;
      description "BGP-EIER";
      neighbor 62.217.126.166 as 5408;
      import filter rejectall;
      export filter bgp_to_grnet;
      bfd on;
    }
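    For a node to actually answer on the anycast address, the service IP from the bird filter above must also be bound locally (loopback is the usual choice), and the BGP session can be inspected from bird's CLI. A hypothetical operational sketch; the interface choice is an assumption:

    ```
    # Bind the anycast service IP on loopback so HAProxy can answer on it
    ip addr add 62.217.126.165/32 dev lo

    # Inspect the BGP session defined above and what it exports
    birdc show protocols all BGP_ANY
    birdc show route export BGP_ANY
    ```

    If HAProxy on a node dies, withdrawing the announcement (e.g. by stopping bird or filtering the export) steers clients to the next-closest location, which is what gives the active-active, VRRP-free setup.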
  13. Recap
    - Puppet as an anycast service
    - Load balancers: availability, locality
    - A manageable pool
    - Availability of puppetworkers
    - Lots more to be done for this ecosystem