Cloudstack design decisions

Cloud operations at scale

Pierre-Yves Ritschard

June 10, 2014

Transcript

  1. CLOUDSTACK DESIGN DECISIONS: CLOUD OPERATIONS AT SCALE

  2. SHORT BIO
    Pierre-Yves Ritschard
    CTO @ exoscale - The safe home for your cloud applications
    Open Source Developer - pithos, cyanite, riemann, collectd, openbsd
    Architect of several cloud platforms - paper.li
    Recovering Operations Engineer
  3. Simple and efficient cloud hosting platform
    Full compatibility with automation tools
    Hosted in a safe jurisdiction
  4. CLOUD BUILDING BLOCKS
    service
    infrastructure
    software
    people

  5. SERVICE: SIMPLICITY AND SCALABILITY
    Cloudstack based
    Basic networking
    Local storage
    KVM hypervisor: SmartOS inspired
  6. CLOUDSTACK Great extensibility, easy to plug into.

  7. BASIC NETWORKING
    One IP per VM.
    Security groups are hypervisor-controlled layer 2 firewall rules.
    Provides all the flexibility of a traditional firewall, completely API controlled.
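
As a rough illustration of the security-group idea on slide 7, the rules can be thought of as an allow-list evaluated for each incoming connection. The sketch below is not CloudStack's implementation: the `Rule` type, the `allowed` function, the default-deny behaviour and the example addresses are all assumptions made for illustration.

```python
from dataclasses import dataclass
from ipaddress import ip_address, ip_network


@dataclass(frozen=True)
class Rule:
    """Hypothetical ingress rule: protocol, inclusive port range, source CIDR."""
    protocol: str
    port_start: int
    port_end: int
    cidr: str


def allowed(rules, protocol, port, source_ip):
    """Assumed default deny: traffic passes only if at least one rule matches."""
    src = ip_address(source_ip)
    return any(
        r.protocol == protocol
        and r.port_start <= port <= r.port_end
        and src in ip_network(r.cidr)
        for r in rules
    )


# Example group: SSH from a private range only, HTTP from anywhere.
web_group = [
    Rule("tcp", 22, 22, "10.0.0.0/8"),
    Rule("tcp", 80, 80, "0.0.0.0/0"),
]
```

Because the rules are plain data, they can be created and changed entirely through an API, which is the property the slide emphasizes.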
  8. LOCAL STORAGE Fast I/O, persistent disks.

  9. KVM HYPERVISOR
    Best in class hypervisor.
    Diskless and netboot approach.
    Avoids resource waste, facilitates upgrades.
  10. INFRASTRUCTURE: THE GOOD CITIZEN CONTRACT
    Configuration management
    Visibility
    Build factory
    Remote execution
  11. THE GOOD CITIZEN CONTRACT
    new machines have roles
    role defines converged configuration as sum of components
    each component has an expected normal state and reports it
    no local intervention needed
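
One way to picture the contract on slide 11 is as a mapping from roles to components, where each component declares its expected state and a node reports whether it matches. This is only a sketch: the role names, component names and the `converged_config` / `report` helpers are hypothetical and do not come from the talk.

```python
# Hypothetical role and component definitions, for illustration only.
ROLES = {
    "web": ["ntp", "nginx"],
    "db": ["ntp", "postgresql"],
}
EXPECTED_STATE = {"ntp": "running", "nginx": "running", "postgresql": "running"}


def converged_config(role):
    """A role's configuration is the sum of its components' expected states."""
    return {component: EXPECTED_STATE[component] for component in ROLES[role]}


def report(role, observed):
    """Compare observed component states with the expected ones for a role."""
    expected = converged_config(role)
    return {c: observed.get(c) == state for c, state in expected.items()}
```

A node with role `web` whose `nginx` is stopped would report `{"ntp": True, "nginx": False}`, which can be acted on without logging into the machine.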
  12. CONFIGURATION MANAGEMENT
    code is a great way to define infrastructure
    ensures homogeneity
    ability to iterate fast
    great source of change tracking
    avoids fear of change
  13. OVER 3000 COMMITS

  14. CONFIGURATION MANAGEMENT: PUPPET
    battle tested tool
    simple declarative DSL to express configuration
    fits our component approach well
  15. VISIBILITY: FROM THE MAP TO THE TERRITORY
    logs
    metrics
    alerts

  16. WHY FOCUS ON VISIBILITY
    distributed systems with lots of moving parts, high node volatility
  17. LOGS
    all application and system logs sent over the wire
    logstash dissects and extracts metadata
    elasticsearch indexes for easy retrieval
    simple correlation
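
To make "dissects and extracts metadata" on slide 17 concrete, here is a minimal sketch of turning one syslog-style line into named fields that could then be indexed. It uses a plain regular expression rather than logstash's own filters, and the `dissect` helper, the field names and the sample line are assumptions for illustration.

```python
import re

# Assumed line shape: "<month> <day> <time> <host> <program>[<pid>]: <message>"
LINE_RE = re.compile(
    r"^(?P<timestamp>\w{3}\s+\d+ \d{2}:\d{2}:\d{2}) "
    r"(?P<host>\S+) "
    r"(?P<program>[^\[:]+)(?:\[(?P<pid>\d+)\])?: "
    r"(?P<message>.*)$"
)


def dissect(line):
    """Return a dict of extracted fields, or just the raw message if unparsed."""
    match = LINE_RE.match(line)
    return match.groupdict() if match else {"message": line}


event = dissect("Jun 10 14:03:22 web01 sshd[4242]: Accepted publickey for deploy")
```

Once every line carries fields such as `host` and `program`, correlating events across machines becomes a matter of querying on those fields.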
  18. (no text on this slide)
  19. METRICS
    all application and system metrics sent over the wire by collectd
    graphite's carbon aggregates and produces appropriate roll-ups
    if it moves, graph it. if it doesn't, graph it if it starts moving.
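
The roll-ups mentioned on slide 19 amount to collapsing fine-grained samples into coarser time buckets. The sketch below shows the idea with a simple average per bucket; it is not carbon's code, and the `rollup` function, the bucket size and the sample data are assumptions for illustration.

```python
def rollup(points, bucket_seconds):
    """Average (timestamp, value) samples into fixed-width time buckets."""
    buckets = {}
    for timestamp, value in points:
        # Align each sample to the start of the bucket it falls into.
        start = timestamp - timestamp % bucket_seconds
        buckets.setdefault(start, []).append(value)
    return [
        (start, sum(values) / len(values))
        for start, values in sorted(buckets.items())
    ]


# Three 10-second samples rolled up into 60-second buckets.
samples = [(0, 1.0), (10, 3.0), (60, 5.0)]
per_minute = rollup(samples, 60)
```

Averaging suits gauges such as CPU load; counters would more likely need a sum or a maximum, which is a choice to make per metric.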
  20. (no text on this slide)
  21. ALERTS
    unbounded stream of log and metric data
    passive approach copes well with node volatility
    riemann takes decisions based on stream content
    ability to extract meaningful information
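
The passive approach on slide 21 can be sketched as follows: nodes push events, and the alerting side decides from the stream alone, treating an event that has outlived its time-to-live as a signal in itself. This is loosely modelled on the idea of events carrying a TTL; it is written in Python rather than riemann's own configuration language, and the `decide` function, event fields and threshold are assumptions for illustration.

```python
def decide(events, now, threshold):
    """Return (host, service, state) alerts derived purely from pushed events."""
    alerts = []
    for event in events:
        if now - event["time"] > event["ttl"]:
            # Nothing recent was received: the node or service went quiet.
            alerts.append((event["host"], event["service"], "expired"))
        elif event["metric"] > threshold:
            alerts.append((event["host"], event["service"], "critical"))
    return alerts


latest = [
    {"host": "web01", "service": "load", "metric": 0.4, "time": 100, "ttl": 30},
    {"host": "web02", "service": "load", "metric": 9.5, "time": 110, "ttl": 30},
    {"host": "web03", "service": "load", "metric": 0.2, "time": 50, "ttl": 30},
]
alerts = decide(latest, now=120, threshold=5.0)
```

Since nothing polls the nodes, machines that appear or disappear need no registration on the alerting side, which is why this style fits a volatile fleet.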
  22. BUILD FACTORY
    continuous integration
    package repositories

  23. CONTINUOUS INTEGRATION
    over 60 build jobs
    ties into our code hosting platform
    handled by jenkins
  24. PACKAGE REPOSITORIES
    generates valid and signed Debian repositories
    ensures fast upgrades
    simplifies configuration management
  25. REMOTE EXECUTION
    a simple pubsub system
    recurrent commands stored as scenarios
    command line, HTTP and IRC interaction
  26. A SIMPLE PUBSUB SYSTEM
    each node runs an agent responsible for carrying out commands.
    commands are sent to groups of nodes (by predicates such as role).
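
The predicate-based targeting on slide 26 can be sketched with an in-memory stand-in for the message bus: every agent sees every command and only acts when the command's predicate matches its own facts. The `Agent` class, the `publish` function and the command format below are hypothetical and are not the actual tool described in the talk.

```python
class Agent:
    """Hypothetical per-node agent that knows a few facts about its node."""

    def __init__(self, host, role):
        self.facts = {"host": host, "role": role}

    def receive(self, command):
        # Act only if every key in the predicate matches this node's facts.
        if all(self.facts.get(k) == v for k, v in command["match"].items()):
            return f"{self.facts['host']}: ran {command['action']}"
        return None


def publish(agents, command):
    """Deliver a command to all agents and collect replies from those that ran it."""
    replies = (agent.receive(command) for agent in agents)
    return [reply for reply in replies if reply is not None]


fleet = [Agent("web01", "web"), Agent("web02", "web"), Agent("db01", "db")]
replies = publish(fleet, {"match": {"role": "web"}, "action": "restart nginx"})
```

The sender never needs a list of hosts; membership in the target group is decided by each node, so new machines are covered as soon as their agent starts.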
  27. RECURRENT COMMANDS STORED AS SCENARIOS
    intricate workflows can be expressed through a simple DSL
  28. COMMAND LINE, HTTP AND IRC INTERACTION
    most of our production environment can be controlled through our chatroom
  29. SOFTWARE: FILLING IN THE GAPS
    Customer management
    Real-time metering and billing
    Integrated console
    A few other things
  30. CUSTOMER MANAGEMENT
    Keeping track of our users
    Support services (ticket management, coupons, emails)
  31. REAL-TIME METERING AND BILLING
    can't be tied to a cloudstack-only solution
    cloudstack emits useful data
    ties into our customer management
  32. INTEGRATED CONSOLE
    integrated experience across our services
    hides complexity and cloudstack specifics
    exposes exoscale-specific features
  33. (no text on this slide)
  34. A FEW OTHER THINGS
    pithos
    cyanite
    fleet
    collectd add-ons

  35. PEOPLE: EFFICIENT WORK. QUIET NIGHTS
    Small SRE team
    Avoiding deploy anxiety
  36. SMALL SRE TEAM
    Our platform must be simple to operate; additional moving parts must provide business value or help operations
  37. AVOIDING DEPLOY ANXIETY
    Our software and infrastructure give us good tools that ensure quiet nights and easily caught errors
  38. LOOKING BACK
    Cloudstack is a solid foundation for an IaaS platform
    There's a bit more to it than just installing cloudstack
    Building a sustainable and scalable platform on top of cloudstack is possible
  39. QUESTIONS?