Container Scheduling Without the Hype: Why Bother?

Tyler L

June 06, 2018

Transcript

  1. 2.

    $ whois tylerjl

    • Infrastructure/software/devops-y things @ Elastic
    • Lots of recent work on dynamic/containerized environments

    Come talk to me about Arm SBCs (or if you want to yell about the Elastic Puppet modules)

         ____________________
        < angry at computers >
         --------------------
                \   ^__^
                 \  (oo)\_______
                    (__)\       )\/\
                        ||----w |
                        ||     ||
  2. 3.

    Who is This For?

    • Why care about container schedulers?
    • What can they offer operations and development?
    • Real-world achievements made possible by these solutions (or, new ideas for current practitioners)
  3. 4.

    Where We’ve Been

    ??? /var/log/? tcp:localhost:??? statsd? prometheus? graphite? ...

    Dependent:
    • Libraries
    • Packages
    • Runtime
    • Distro
    • etc.
  4. 5.

    Where We Can Go

    ??? /var/log/? tcp:localhost:??? statsd? prometheus? graphite? ...

    Dependent:
    • Libraries
    • Packages
    • Runtime
    • Distro
    • etc.

    Let’s talk about:
    • Runtime
    • Monitoring
    • Persistence
    • Services
  5. 6.

    Runtime (Traditional)

    • Without containers
      ◦ Don’t even - dependencies are separate from code, messy
    • With just containers
      ◦ Where are you running them? Cloud instances?
      ◦ How are you scheduling and running them?
  6. 7.

    Runtime

        ---
        image: org/app:1.0
        env:
          FOO: bar
        count: 3

    • Nodes are cattle
    • Contract w/consumers is clear:
      ◦ Build instructions
      ◦ Runtime instructions

        FROM python:3
        COPY app.py app.py
        CMD python app.py
  7. 8.

    Runtime

        ---
        image: org/app:1.0
        env:
          FOO: bar
        count: 3

    • Nodes are cattle
    • Contract w/consumers is clear:
      ◦ Build instructions
      ◦ Runtime instructions

        FROM python:3
        COPY app.py app.py
        CMD python app.py

    • Deployments are always the same bits - repeatability
    • Updates are hands-off for both dev and ops - rolling container upgrades
    • Application changes async from backend (container build instructions)
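The declarative job on this slide (image, env, count) can be read as a desired state that a scheduler keeps reconciling against what is actually running. A minimal sketch of that idea follows; `JobSpec` and `reconcile` are hypothetical names for illustration and do not correspond to the API of Kubernetes, Nomad, or Mesos. Real schedulers replace containers gradually during an upgrade, which this sketch does not attempt.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class JobSpec:
    """Hypothetical job description mirroring the slide's image/env/count fields."""
    image: str
    count: int
    env: Dict[str, str] = field(default_factory=dict)


def reconcile(spec: JobSpec, running: List[str]) -> List[str]:
    """Return the actions that move `running` toward `spec`.

    `running` holds one image tag per live container.
    """
    actions = []
    # Containers on any other image are stale and get replaced.
    stale = [image for image in running if image != spec.image]
    up_to_date = len(running) - len(stale)
    for image in stale:
        actions.append(f"stop {image}")
    # Start or stop containers until the count matches the spec.
    missing = spec.count - up_to_date
    for _ in range(max(missing, 0)):
        actions.append(f"start {spec.image}")
    for _ in range(max(-missing, 0)):
        actions.append(f"stop {spec.image}")
    return actions


spec = JobSpec(image="org/app:1.0", count=3, env={"FOO": "bar"})
print(reconcile(spec, ["org/app:0.9", "org/app:1.0"]))
```

Because the spec only names an image and a count, the same loop serves first deployment, scaling, and upgrades; neither dev nor ops has to script those cases separately.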
  8. 9.

    Monitoring (Traditional)

    Logs
    • Format? Path?
    • Opt-in
    • Accessibility?

    Metrics
    • System metrics != app metrics
    • Scrape from app?

    Alerts
    • Metrics are good; deployment statistics as well?
  9. 11.

    Monitoring

    stdout / stderr

    • Zero-config for generic logs/metrics out of the box
    • Easily build custom tools atop this data for out-of-the-box alerting as well
    • Logs/metrics become self-service with appropriate visualization solutions
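On the application side, the stdout/stderr contract can be as small as writing one structured line per event and letting the platform collect it. The sketch below assumes JSON lines are a convenient format for whatever log shipper the platform runs; the slide does not prescribe a format, and `JsonFormatter` and `build_logger` are hypothetical helpers built on the standard `logging` module.

```python
import json
import logging
import sys


class JsonFormatter(logging.Formatter):
    """Render each log record as a single JSON line."""

    def format(self, record):
        return json.dumps({
            "level": record.levelname,
            "logger": record.name,
            "message": record.getMessage(),
        })


def build_logger(name="app", stream=None):
    """Return a logger that writes JSON lines to stdout (or a given stream)."""
    handler = logging.StreamHandler(stream if stream is not None else sys.stdout)
    handler.setFormatter(JsonFormatter())
    logger = logging.getLogger(name)
    logger.handlers = [handler]   # replace any handlers configured earlier
    logger.setLevel(logging.INFO)
    logger.propagate = False      # avoid duplicate lines via the root logger
    return logger


log = build_logger()
log.info("request served in %d ms", 12)
```

The app never needs to know a log path or shipper address, which is what lets the platform offer logs as a self-service feature.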
  10. 12.

    Persistence (Traditional)

    • Shared mass storage (ceph, gluster) in traditional setups
    • Dynamically attached storage in the case of cloud environments (EBS)
    • Works, but:
      ◦ What ties them together, provisions them, migrates them, backs them up?

    (diagram: big ol’ data ?)
  11. 13.

    Persistence

        ---
        volume:
          size: 50G

    • Like runtime definitions, the underlying impl. isn’t a concern
    • Carve off a hunk of storage as needed
    • Scheduling is happening all the time, storage follows

    (diagram: big ol’ data)
  12. 14.

    Persistence

        ---
        volume:
          size: 50G

    • Like runtime definitions, the underlying impl. isn’t a concern
    • Carve off a hunk of storage as needed
    • Scheduling is happening all the time, storage follows
    • Nobody cares where or what the persistence base is, we just have space now
    • Infra can develop tools to enhance storage for everyone (automated backups, snapshotting, etc.)
    • Backend-agnostic - GCP, AWS, Azure, etc.

    (diagram: big ol’ data)
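The backend-agnostic point can be sketched as a volume request that only states a size, with the storage backend plugged in behind a small interface. Everything named here (`parse_size`, `InMemoryBackend`, `claim`) is hypothetical, and reading "G" as GiB is an assumption, since the slide does not say which unit convention applies.

```python
import re

# Assumption: binary units (GiB etc.); the slide only shows "50G".
_UNITS = {"M": 1024 ** 2, "G": 1024 ** 3, "T": 1024 ** 4}


def parse_size(text):
    """Convert a size string such as "50G" into bytes."""
    match = re.fullmatch(r"(\d+)([MGT])", text.strip())
    if match is None:
        raise ValueError(f"unrecognized size: {text!r}")
    number, unit = match.groups()
    return int(number) * _UNITS[unit]


class InMemoryBackend:
    """Hypothetical stand-in for a real backend (cloud disks, Ceph, ...)."""

    def __init__(self, capacity_bytes):
        self.free = capacity_bytes
        self.volumes = {}

    def provision(self, name, size_bytes):
        if size_bytes > self.free:
            raise RuntimeError("not enough free capacity")
        self.free -= size_bytes
        self.volumes[name] = size_bytes
        return name


def claim(spec, name, backend):
    """Satisfy a job's volume request with whichever backend is plugged in."""
    return backend.provision(name, parse_size(spec["volume"]["size"]))


backend = InMemoryBackend(capacity_bytes=parse_size("1T"))
claim({"volume": {"size": "50G"}}, "app-data", backend)
print(backend.volumes)
```

Swapping `InMemoryBackend` for another class with the same `provision` method leaves the job definition untouched, which is where shared tooling such as backups or snapshots could hook in.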
  13. 15.

    Services (Traditional)

    • Both internal and external:
      ◦ Spin up an app, add it to a pool of servers
      ◦ Health checks sometimes
      ◦ Typically, the “expose this” process is very loosely coupled with “provision this”
  14. 16.

    Services

    • Tie service endpoints to groups of containers and let the router/proxy handle it for you

    (diagram: pods)
  15. 17.

    Services

    • Load balancers become a by-product of naturally selecting endpoints from a pool of healthy endpoints

    (diagram: pods)
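Load balancing as a by-product of health-aware endpoint selection can be sketched in a few lines. The `Service` class below is a hypothetical illustration, not the data model of any particular scheduler or proxy; it round-robins over whichever endpoints are currently marked healthy.

```python
import itertools


class Service:
    """Hypothetical service: a named pool of container endpoints."""

    def __init__(self, name):
        self.name = name
        self.endpoints = {}              # address -> healthy flag
        self._counter = itertools.count()

    def register(self, address, healthy=True):
        self.endpoints[address] = healthy

    def mark(self, address, healthy):
        """Record the result of a health check for a known endpoint."""
        if address not in self.endpoints:
            raise KeyError(address)
        self.endpoints[address] = healthy

    def pick(self):
        """Round-robin over the endpoints that are currently healthy."""
        healthy = sorted(a for a, ok in self.endpoints.items() if ok)
        if not healthy:
            raise LookupError(f"no healthy endpoints for {self.name}")
        return healthy[next(self._counter) % len(healthy)]


svc = Service("web")
for address in ("10.0.0.1:8080", "10.0.0.2:8080", "10.0.0.3:8080"):
    svc.register(address)
svc.mark("10.0.0.2:8080", healthy=False)
print([svc.pick() for _ in range(3)])
```

Nothing in the pool is configured by hand: containers register as they start, and a failed health check simply removes an address from the candidates.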
  16. 18.

    Services (+MORE)

    Traefik/Envoy/Fabio are solving neat problems:
    • Automatic Let’s Encrypt TLS
    • Automatic Host/app name routing
    • Networking ACLs
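Automatic host/app name routing boils down to deriving a hostname from each app's name and matching incoming Host headers against it. The sketch below is a generic illustration with hypothetical function names and an assumed `<app>.<domain>` naming scheme; it does not reflect how Traefik, Envoy, or Fabio are configured, and it ignores IPv6 literals in the Host header.

```python
def build_routes(apps, domain):
    """Map "<app>.<domain>" hostnames to each app's backend addresses."""
    return {f"{name}.{domain}".lower(): backends for name, backends in apps.items()}


def resolve(host_header, routes):
    """Return the backends for a request's Host header, or an empty list."""
    hostname = host_header.split(":", 1)[0].lower()   # drop an optional port
    return routes.get(hostname, [])


apps = {"blog": ["10.0.0.4:8000"], "api": ["10.0.0.5:9000", "10.0.0.6:9000"]}
routes = build_routes(apps, "example.test")
print(resolve("API.example.test:443", routes))
```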
  17. 19.

    Better Processes

    Runtime | Monitoring | Persistence | Services

    • Contracts are clear - no one needs to learn another team’s tools if they don’t want to
    • Improvements and iteration are completely unblocked on either side
    • Infra tooling becomes immediately useful for everyone on the platform
  18. 20.

    Thank you!

    github.com/tylerjl
    irc/twitter: leothrix
    tjll.net

    Additional Information:
    • Google for:
      ◦ Kubernetes
      ◦ Nomad
      ◦ Mesos
      ◦ Traefik
      ◦ Envoy

    Let’s talk about monitoring/metrics at the Elastic booth