The Testing – Monitoring Continuum (DevOps Sydney, 2015)

Rob Howard
January 15, 2015

"Testing" and "Monitoring" are often seen as completely separate things. Let's try mixing it up.

(Lightning talk.)

Transcript

  1. The Testing↔Monitoring Continuum

  2. Question Time

  3. Ops: Who here uses Nagios? Monit+MMonit? ... Serverspec?

  4. Dev: Who writes unit tests? Integration tests, e.g. using browser-driving full-stack tools like Selenium, Capybara, etc.?
  5. What is Monitoring?

  6. You're developing a product; an app or something. Let's say we have a bunch of machines running that app.
  7. Load Balancers

  8. Load Balancers, App Servers

  9. Load Balancers, App Servers, Data Store

  10. We're monitoring. What do we do? Well, first we should probably make sure that the servers are actually up. Easy!
  11. Well, what about more specific things? Is PostgreSQL running on the database server? Can we see its PID?
  12. Is Postgres accepting connections?

  13. Is it accepting connections with the right username + password for the app? Maybe we stuffed up a config rollout.
  14. Okay, but does it have the PG extensions the app needs, e.g. for UUID generation?
  15. Is the app's database named correctly?

  16. Can the app see the tables it needs in the database?
  17. Can it write to those tables? Maybe we screwed up the permissions.
  18. THIS IS GETTING A BIT MUCH.

  19. Do we have to do this for every service or node that we're running? Where do we stop?
  20. Run the App. Well, maybe the best way of doing this is running the app itself. We could write a bash+curl script that, like, tests just logging in.
  21. Run the App's Tests. But is that testing everything the app needs to use? Maybe it'll break on the next click. Why not go the whole hog? Our app has an integration test suite (or should have). We spent a lot of money on it!
  22. Story Time: Let's say we have a multi-tenant, hosted, Software-as-a-Service app that users buy instances/accounts for: VM hosting, chat, whatever.
  23. Local Dev. Env: We'd have unit tests that you run on your local box.
  24. Local Dev. Env: But also those big browser-driven tests as well. The test runner is still local, against a local copy of your app.
  25. Local Dev. Env / Production / Staging: We have staging and production environments too.
  26. Why don't we:
    * Spin up a new account on staging.
    * Run the integration tests against that new account.
    * Throw away the account afterwards.
  27. It could be a custom app kicking off these test runs, but it could easily be Jenkins.
  28. Do the same for production! Have these tests run over and over again. This chews up some of your production capacity, but gives you greater assurance that your app works when placed into the staging and production environments you've configured and rolled out.
  29. We're testing the app+infrastructure interface. We're testing that the, say, file upload feature on your chat app actually works with the infrastructure it's relying on.
  30. It's not super-easy or perfect, and testing interactions with external systems (particularly payment ones) is hard; it might just involve turning off parts of your test suite and instrumenting error detection instead.
  31. And finally, to be clear, this isn't replacing your environment tests (e.g. available disk/RAM/CPU) or error-rate instrumentation; this is to alleviate the need for a ton of individual fine-grained service checks for things that would be better tested by hitting the app with your existing test suite.
  32. Testing, Monitoring: Back to the title. Instead of Testing and Monitoring as separate, discrete things, I'd argue that…
  33. Testing, Testing + Monitoring: … Testing is a part of Good Monitoring.
  34. Fin. Rob Howard, @damncabbage, https://speakerdeck.com/damncabbage/ Thanks! One final thing…

  35. I work at OrionVM and we're hiring; we're building cloud hosting (physical) infrastructure, and we're after an Ops person (networking+routing, physical server wrangling, configuration management) and a Ruby/JS dev (UI) to help out.
  36. Fin. Rob Howard, @damncabbage, https://speakerdeck.com/damncabbage/
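
The spin-up, test, throw-away loop from slides 26 to 28 could be sketched roughly like this in Python. This is only an illustration under assumptions, not the tooling from the talk: `run_monitoring_cycle`, `CycleResult`, and the three callables are hypothetical names. In practice they would wrap your app's own account API and your existing integration test runner, kicked off from Jenkins or similar.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class CycleResult:
    """Outcome of one test-as-monitoring run (hypothetical structure)."""
    account_id: str
    passed: bool
    detail: str


def run_monitoring_cycle(
    create_account: Callable[[], str],
    run_tests: Callable[[str], bool],
    delete_account: Callable[[str], None],
) -> CycleResult:
    """Create a throwaway account, run the suite against it, always clean up.

    All three callables are assumptions standing in for real integrations:
    create_account returns a new account id, run_tests returns True when the
    integration suite passes, delete_account discards the account.
    """
    account_id = create_account()
    try:
        try:
            passed = run_tests(account_id)
            detail = "suite passed" if passed else "suite failed"
        except Exception as exc:
            # A crashing suite counts as a monitoring failure too.
            passed = False
            detail = f"suite crashed: {exc!r}"
    finally:
        # Throw the account away whether or not the tests passed.
        delete_account(account_id)
    return CycleResult(account_id, passed, detail)


if __name__ == "__main__":
    # In-memory stand-ins so the sketch runs without any real environment.
    live_accounts = set()

    def fake_create() -> str:
        live_accounts.add("tmp-account")
        return "tmp-account"

    result = run_monitoring_cycle(
        fake_create,
        lambda account_id: True,
        live_accounts.discard,
    )
    print(result.passed, result.detail, len(live_accounts))
```

The `try`/`finally` is the main design choice here: the throwaway account gets deleted even when the suite fails or crashes, so repeated runs don't pile up leftover accounts on staging or production. A scheduler would then alert on any `CycleResult` with `passed` set to `False`.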