CloudInit: The Good Parts

Cloud-Init is the de facto industry standard for early-stage initialization of virtual machines in the cloud, but few engineers are familiar with everything that it has to offer.

All Linux virtual machines have Cloud-Init in their boot phase, whether they're as small as a t3.nano instance in AWS, as large as a Standard_HB60rs in Azure, or an on-premises OpenStack instance. Originally designed for Ubuntu in EC2, Cloud-Init provides at-first-boot configuration support across most Linux distributions and all major clouds.

Many operators are familiar with supplying a shell script via user-data when provisioning their compute resources, but Cloud-Init has a massive amount of other functionality that is, more often than not, left untapped.

In this talk, Event Store co-founder James Nugent explores that untapped potential by looking through some of the features of Cloud-Init and how to take advantage of them to improve the operability and resilience of your cloud operations.

James Nugent

August 05, 2019

Transcript

  1. Cloud-Init: The Good Parts
    James Nugent
    Event Store Ltd
    @jen20
    jen20

  2. Why Cloud-Init?
    • The de facto industry standard for early-stage initialisation of virtual machines in
    the cloud running Unix-derived operating systems.
    • Used to specialise a generic operating system image at runtime by provisioning
    a given set of configuration.
    • Originally developed by Canonical for configuring Ubuntu Linux running in
    Amazon EC2.
    • Now prevalent across all major clouds and most Unix-based operating systems.

  3. Why Cloud-Init?

  4. Why Cloud-Init?

  5. Why Cloud-Init?
    • Building a new machine image for each role a virtual machine must play in your
    infrastructure can be costly in terms of time:
      • The cycle of booting a machine, customising it, and imaging it can take anywhere
      from 5 minutes to over an hour.
      • For rapidly evolving software, building an image for each version of a program
      increases the amount of time it takes to get that version to production.
    • Correctly constrained, runtime customisation can provide many of the benefits of
    an image-based workflow, with different trade-offs.
    • Image-based workflows can still make use of Cloud-Init in the image building process!

  6. Configuring Cloud-Init

  7. Configuring Cloud-Init
    • The cloud-init package is installed in the operating system images supplied by
    most clouds. On systemd-based Linux, cloud-init.service usually runs at boot,
    as a oneshot service.
    • When started with the init sub-command, cloud-init runs the commands
    defined in a sequence of modules to specialise the operating system installation for
    the intended purpose.
    • Configuration comes from two sources:
      • Cloud provider-supplied metadata
      • User-supplied configuration
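
User-supplied configuration reaches cloud-init as user data, and the first line of that payload decides how it is handled (for example, `#!` for a script or `#cloud-config` for YAML configuration). The sketch below is a simplified illustration of that dispatch, not cloud-init's own code: the function name `classify_user_data` is hypothetical, and the prefix table is an abridged subset of the documented formats.

```python
from typing import Optional

# Abridged map from a user-data first-line prefix to the MIME type
# cloud-init associates with it. Illustrative subset only.
PREFIX_TO_MIME = {
    "#!": "text/x-shellscript",
    "#cloud-config": "text/cloud-config",
    "#cloud-config-archive": "text/cloud-config-archive",
    "#cloud-boothook": "text/cloud-boothook",
    "#include": "text/x-include-url",
    "#include-once": "text/x-include-once-url",
}


def classify_user_data(payload: str) -> Optional[str]:
    """Return the MIME type implied by the first line, or None if unrecognised."""
    first_line = payload.lstrip().splitlines()[0] if payload.strip() else ""
    # Check longer prefixes first so "#cloud-config-archive" is not
    # mistaken for plain "#cloud-config".
    for prefix in sorted(PREFIX_TO_MIME, key=len, reverse=True):
        if first_line.startswith(prefix):
            return PREFIX_TO_MIME[prefix]
    return None
```

A shell script and a `#cloud-config` document therefore travel through the same user-data field and differ only in that header line.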

  8. Configuring Cloud-Init

  9. Example
    A Shell Script in User Data

  10. A Shell Script in User Data

  11. A Shell Script in User Data

  12. A Shell Script in User Data

  13. Example
    #cloud-config

  14. #cloud-config

  15. #cloud-config

  16. #cloud-config

  17. #cloud-config Schema
    • #cloud-config is a complex YAML schema, whose valid components are affected
    by which modules are installed.
    • Documentation is somewhat hit-and-miss. Most of the information is in the docs,
    somewhere. All of the information is in the (Python) source code of cloud-init.
    • Unless referring constantly to the code, writing the configuration files can be an
    iterative process of trial and error until you have a sufficiently large collection of
    reusable sections which you can cargo-cult into doing what you want.
    • cloud-init(1) has limited built-in schema validation functionality, but most
    modern editors will do just as well here with a YAML plugin.
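
For the built-in schema validation mentioned above, a small wrapper might look like the following. This is a sketch under assumptions: recent cloud-init releases expose the check as `cloud-init schema --config-file`, while older releases placed it under `cloud-init devel schema`. The function names `schema_command` and `validate` are hypothetical.

```python
import shutil
import subprocess


def schema_command(config_path: str, legacy: bool = False) -> list:
    """Build the argv for cloud-init's schema check.

    Assumption: newer releases accept `cloud-init schema`, while older
    ones expose the same check as `cloud-init devel schema`.
    """
    subcommand = ["devel", "schema"] if legacy else ["schema"]
    return ["cloud-init", *subcommand, "--config-file", config_path]


def validate(config_path: str, legacy: bool = False) -> bool:
    """Run the check; True means cloud-init exited with status 0."""
    if shutil.which("cloud-init") is None:
        raise RuntimeError("cloud-init is not installed on this machine")
    result = subprocess.run(schema_command(config_path, legacy))
    return result.returncode == 0
```

Running this before a machine is launched catches misspelt keys earlier than a failed boot would.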

  18. #cloud-config Schema

  19. #cloud-config Schema

  20. #cloud-config Schema

  21. Example
    Host SSH Keys

  22. Host SSH Keys

  23. Host SSH Keys

  24. Host SSH Keys
    • There is no built-in module for specifying host keys at first boot, so we’ll need to
    build this ourselves.
    • Breaking this task down, we’ll need to do a few different things:
      • Generate some known host keys, and get them to the virtual machine
      • Move the keys into /etc/ssh before the first time the ssh.service unit starts
    • To do this correctly, it’s necessary for us to dig into the cloud-init default
    configuration to see what order modules run in, and choose the correct places to
    insert our logic.
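
The first task, generating known host keys ahead of time, could be sketched as below. It assumes the OpenSSH `ssh-keygen` tool is available on the machine doing the provisioning; the helper names and the `host-key` comment are hypothetical, and the talk's own tooling may differ.

```python
import shutil
import subprocess
import tempfile
from pathlib import Path


def keygen_command(key_type: str, key_path: Path, comment: str) -> list:
    """argv for ssh-keygen: quiet (-q), empty passphrase (-N ""), output file (-f)."""
    return ["ssh-keygen", "-q", "-t", key_type, "-N", "",
            "-C", comment, "-f", str(key_path)]


def generate_host_key(key_type: str, directory: Path,
                      comment: str = "host-key") -> tuple:
    """Generate one key pair and return (private_text, public_text)."""
    key_path = directory / f"ssh_host_{key_type}_key"
    subprocess.run(keygen_command(key_type, key_path, comment), check=True)
    public_path = key_path.with_name(key_path.name + ".pub")
    return key_path.read_text(), public_path.read_text()


if __name__ == "__main__" and shutil.which("ssh-keygen"):
    # Keys are written to a throwaway directory and read back as text,
    # ready to be embedded in user data.
    with tempfile.TemporaryDirectory() as scratch:
        private_key, public_key = generate_host_key("ed25519", Path(scratch))
        print(public_key.strip())
```

Because the public half is known before the machine boots, it can be recorded in a known_hosts file in advance.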

  25. Host SSH Keys
    • Cloud-init runs in three phases:
      • Init - essential configuration that must be done early on
      • Config - configuration that doesn’t affect other stages of boot
      • Final - configuration that must be run as late as possible
    • The configuration (by default) lives in /etc/cloud/cloud.cfg, and is YAML.
    • One of the things configured in cloud.cfg is which modules run in
    which phase.
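
To make the ordering concrete, here is a minimal sketch that models the three module lists from cloud.cfg as Python data. The key names follow the stock configuration file, but the module lists are heavily abridged, and their exact contents and order vary by distribution and cloud-init version. The helper names are hypothetical.

```python
# Abridged, illustrative view of the three module lists in
# /etc/cloud/cloud.cfg. Real files list many more modules.
STAGES = {
    "cloud_init_modules": ["bootcmd", "write-files", "users-groups", "ssh"],
    "cloud_config_modules": ["locale", "runcmd"],
    "cloud_final_modules": ["scripts-user", "phone-home", "final-message"],
}


def position(module: str) -> tuple:
    """Return (stage_index, index_within_stage) for a module name."""
    for stage_index, modules in enumerate(STAGES.values()):
        if module in modules:
            return stage_index, modules.index(module)
    raise KeyError(module)


def runs_before(first: str, second: str) -> bool:
    """True if `first` is scheduled ahead of `second`."""
    return position(first) < position(second)
```

In this abridged ordering, `runs_before("write-files", "ssh")` holds, which is the property the host-key task relies on: files can be written before the ssh module runs.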

  26. Host SSH Keys

  27. Host SSH Keys

  28. Host SSH Keys

  29. Host SSH Keys

  30. Host SSH Keys

  31. More about write-files
    • File content needs to be embedded in the YAML configuration file.
    • If we wanted to provide files from a remote source, we could use a script to
    download them and verify their checksums.
    • Properties we can set for each file are:
      • Content (in one of a variety of encodings)
      • Path
      • Owner
      • Mode
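
The properties above map onto keys of a `write_files` entry. The sketch below renders one such entry as YAML text from Python; note that the key for the file mode is `permissions`. The helper names are hypothetical, and the rendering assumes a simple path with no characters that need YAML quoting.

```python
import base64


def write_files_entry(path: str, content: bytes, owner: str = "root:root",
                      permissions: str = "0600") -> str:
    """Render one write_files list item as YAML text, base64-encoding the content."""
    encoded = base64.b64encode(content).decode("ascii")
    return (
        f"  - path: {path}\n"
        f"    encoding: b64\n"
        f"    content: \"{encoded}\"\n"
        f"    owner: {owner}\n"
        # Quoted so YAML keeps the leading zero instead of reading a number.
        f"    permissions: '{permissions}'\n"
    )


def cloud_config(entries: list) -> str:
    """Wrap rendered entries in a #cloud-config document."""
    return "#cloud-config\nwrite_files:\n" + "".join(entries)
```

Base64 is one of the supported content encodings and avoids YAML indentation problems with multi-line key material.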

  32. Host SSH Keys

  33. Generating Host Keys

  34. Generating Host Keys

  35. Generating Host Keys

  36. Generating Host Keys

  37. Write-Files Configuration

  38. Write-Files Configuration

  39. Additional Configuration

  40. Multi-part Configuration for User-Data

  41. Generating Multi-part Cloud Config
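
One way to generate multi-part user data is with Python's standard `email` package, sketched below. This is an assumption about tooling rather than the method used in the talk; the function names and file names are hypothetical. Each part carries a `text/...` MIME subtype such as `cloud-config` or `x-shellscript`, which tells cloud-init how to handle it.

```python
from email import message_from_string
from email.mime.multipart import MIMEMultipart
from email.mime.text import MIMEText


def build_multipart(parts: list) -> str:
    """Combine (content, subtype, filename) tuples into one MIME multipart message."""
    message = MIMEMultipart()
    for content, subtype, filename in parts:
        part = MIMEText(content, subtype, "utf-8")
        part.add_header("Content-Disposition", "attachment", filename=filename)
        message.attach(part)
    return message.as_string()


def part_types(user_data: str) -> list:
    """Parse a multipart message and list the content type of each part."""
    parsed = message_from_string(user_data)
    return [part.get_content_type() for part in parsed.get_payload()]


if __name__ == "__main__":
    user_data = build_multipart([
        ("#cloud-config\npackage_update: true\n", "cloud-config", "config.yaml"),
        ("#!/bin/sh\necho configured\n", "x-shellscript", "setup.sh"),
    ])
    print(part_types(user_data))
```

The resulting string is what would be supplied as user data when the machine is created.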

  42. Booting a Virtual Machine

  43. Booting a Virtual Machine

  44. About those docs…

  47. ssh_keys Configuration

  48. ssh_keys Configuration

  49. Debugging

  50. ssh_keys Configuration

  51. Other Use Cases
    • Some use cases we didn’t look at today, but which are easy enough to accomplish:
      • Change the filesystem types for attached volumes (e.g. to XFS)
      • Configure a package repository (yum or apt) and install packages at boot
      • Install Docker, pull and run an image from Docker Hub
      • Run Chef or Puppet in standalone mode on boot, after downloading or installing the
      configuration from a package
      • Write out an /etc/machine-role to use as a base for accessing SSM Parameter Store trees
      • Join a node to a Serf or Consul cluster
      • Post a notification to Slack using the “phone home” module
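
As an illustration of two of these use cases, the sketch below renders a `#cloud-config` document that installs packages at boot and writes /etc/machine-role. The function name and the example role and package are hypothetical, and the rendering assumes the role contains no quotes or backslashes.

```python
def role_user_data(role: str, packages: list) -> str:
    """Render a #cloud-config that installs packages and records the machine role."""
    package_lines = "".join(f"  - {name}\n" for name in packages)
    return (
        "#cloud-config\n"
        "package_update: true\n"
        "packages:\n"
        + package_lines
        + "write_files:\n"
        "  - path: /etc/machine-role\n"
        # A YAML double-quoted scalar; the \n escape gives the file a trailing newline.
        '    content: "' + role + '\\n"\n'
        "    permissions: '0644'\n"
    )


if __name__ == "__main__":
    print(role_user_data("web", ["nginx"]))
```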

  52. Summary
    • Cloud-init packs in a huge amount of functionality, but it’s not necessarily
    very discoverable.
    • It is worth learning at least the basics if:
      • You want a runtime-specialisation-based workflow
      • You work across a diverse range of clouds or operating systems and want a
      common configuration tool

  53. Thanks!