Slide 1

Slide 1 text

Testing and Automation in the era of Containers
Verónica López, Sr. Software Engineer @ CoreOS
@maria_fibonacci
FOSDEM 2018

Slide 2

Slide 2 text

whoami
● Sr. Software Engineer @ CoreOS
● Automating RHEL
● Now: Distributed Systems. Before: Scientific Computing, Mobile
● Common path: Linux

Slide 3

Slide 3 text

Agenda
- How we do it @ CoreOS
- Strategy
- What I’ve learned
- Change of Paradigm
- Go is great for tooling
- Closing thoughts

Slide 4

Slide 4 text

How we do it
- Every team/dev takes care of their own tests
- Don’t merge until all tests pass or get fixed
- Strict Release Automation guidelines
- Good for accountability, and also because products are different

Slide 5

Slide 5 text

How we do it
- Some teams need/want a tougher strategy, like etcd
- +100k lines of code, of which +60k are tests
- Unit, integration, migration, e2e, regression, stress, functional, benchmarks
- We also use Bazel...

Slide 6

Slide 6 text

How we do it
- The Testing and Automation team works on building targeted tools, not on QA
- Relatively new team: still evolving and learning (LOL)
- Framework: no more

Slide 7

Slide 7 text

What I’ve learnt
- No formal experience in testing, but experience with many distributed systems: fresh look
- Focus on test coverage: necessary for baseline, not enough
- Incomplete e2e scenarios. Empathy necessary.
- Engineers’ backgrounds make a difference

Slide 8

Slide 8 text

What I’ve learnt
- My experience: Large systems in Latin America. All of them live or die by ruthless optimization of all the parts.
- Privileged/bleeding edge cultures: can afford overengineering and rewrites (skilled people)
- Would be great to have the best of both worlds

Slide 9

Slide 9 text

Change of Paradigm
- Testing distributed systems is hard. New considerations necessary.
- Formal verification of a distsys: hard and expensive.
- All the perspectives at once: combinatorial possibilities.
- Containers, service meshes, K8s: solve many problems, are fault tolerant, but need different levels of specialized skills.

Slide 10

Slide 10 text

Testing & Containers
1. Using containers for testing
2. How to test containers

Slide 11

Slide 11 text

Using containers for testing
Package your test suite in a container, and make your system run against it.
Benefits: practical, neat, fast, secure, scalable, portable.
Sets a standard for distributed components.
One of the developing strategies @CoreOS.

Slide 12

Slide 12 text

Testing Containers...
- K8s e2e test suite: “It is not uncommon that a minor change may pass all unit and integration tests, but cause unforeseen changes at system level”

Slide 13

Slide 13 text

Testing Containers...
- E2e: by definition, as comprehensive as possible
- Doing that requires deep knowledge of the system
- The fact that we don’t have to worry about fixing things anymore doesn’t mean that we don’t have to know how they work, and how they fail.
- Familiarity with the lifecycles of clusters, pods, the OS, etc.

Slide 14

Slide 14 text

Testing Containers...
- Unit testing: always important
- Two outcomes: either incomplete or too complicated
- Massive integration tests: antipattern
- Based on mocks: only 3 nodes are needed to reproduce a failure, but what matters is the number of inputs
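To make the point about mocks and inputs concrete, here is a minimal Go sketch. It swaps a real cluster for an in-memory fake and drives one function through a table of inputs. The names `Store`, `fakeStore`, `Lookup`, and `ErrNotFound` are hypothetical and chosen only for illustration; they do not come from the talk or from any CoreOS code.

```go
package main

import (
	"errors"
	"fmt"
)

// Store is a hypothetical dependency; in a real system it would be
// backed by a remote service rather than a map.
type Store interface {
	Get(key string) (string, error)
}

// ErrNotFound is the non-fatal error the fake returns for missing keys.
var ErrNotFound = errors.New("key not found")

// fakeStore is an in-memory stand-in used instead of a real cluster.
type fakeStore map[string]string

func (f fakeStore) Get(key string) (string, error) {
	v, ok := f[key]
	if !ok {
		return "", ErrNotFound
	}
	return v, nil
}

// Lookup returns the stored value, or fallback when the key is missing.
// Any other error is wrapped and passed on to the caller.
func Lookup(s Store, key, fallback string) (string, error) {
	v, err := s.Get(key)
	if errors.Is(err, ErrNotFound) {
		return fallback, nil
	}
	if err != nil {
		return "", fmt.Errorf("lookup %q: %w", key, err)
	}
	return v, nil
}

func main() {
	store := fakeStore{"region": "eu-west", "empty": ""}

	// The table of inputs is where the coverage comes from:
	// a present key, an empty value, a missing key, an empty key.
	cases := []struct{ key, want string }{
		{"region", "eu-west"},
		{"empty", ""},
		{"missing", "default"},
		{"", "default"},
	}
	for _, c := range cases {
		got, err := Lookup(store, c.key, "default")
		fmt.Printf("key=%q got=%q err=%v ok=%t\n", c.key, got, err, got == c.want)
	}
}
```

Adding a row to the table is cheap, which is the reason to prefer this shape over a large integration test when the interesting variation is in the inputs.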

Slide 15

Slide 15 text

Testing Containers...
Monitoring and Support teams must not act as your systems’ nanny!

Slide 16

Slide 16 text

Testing Containers...
Writing this talk: scary similarities. Many people, working on different teams, are experiencing the same problems and arriving at the same perspectives.
Cindy @copyconstruct likes to write about it: http://bit.ly/2meWzaF

Slide 17

Slide 17 text

Testing Containers...
“Simple Testing Can Prevent Most Critical Failures: An Analysis of Production Failures in Distributed Data-Intensive Systems”
- 92% of the catastrophic system failures are the result of incorrect handling of non-fatal errors
- Go: error handling, cover
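As an illustration of handling a non-fatal error explicitly, here is a small Go sketch. It retries an operation only when the error is known to be transient and returns anything else immediately. `Retry`, `flaky`, `ErrTransient`, and `ErrFatal` are hypothetical names invented for this example; the talk does not prescribe this particular pattern.

```go
package main

import (
	"errors"
	"fmt"
)

// ErrTransient marks a failure that is safe to retry (assumed for this sketch).
var ErrTransient = errors.New("transient failure")

// ErrFatal marks a failure that must not be retried (assumed for this sketch).
var ErrFatal = errors.New("fatal failure")

// flaky returns an operation that fails with ErrTransient the first
// `failures` times it is called, then succeeds.
func flaky(failures int) func() error {
	return func() error {
		if failures > 0 {
			failures--
			return ErrTransient
		}
		return nil
	}
}

// Retry runs op up to attempts times. It retries only when the error is
// ErrTransient; any other error is returned to the caller immediately.
func Retry(attempts int, op func() error) error {
	if attempts < 1 {
		attempts = 1
	}
	var err error
	for i := 0; i < attempts; i++ {
		err = op()
		if err == nil {
			return nil
		}
		if !errors.Is(err, ErrTransient) {
			return err
		}
	}
	return fmt.Errorf("gave up after %d attempts: %w", attempts, err)
}

func main() {
	fmt.Println(Retry(3, flaky(2))) // recovers within the attempt budget
	fmt.Println(Retry(3, flaky(5))) // budget exhausted, error is reported
	fmt.Println(Retry(3, func() error { return ErrFatal }))
}
```

Each branch (success, retry, give up, fatal) is small enough to unit test, and `go test -cover` can show whether the error paths are actually exercised.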

Slide 18

Slide 18 text

Go Tooling
- Go was created with tooling in mind. There’s an article about it: https://talks.golang.org/2012/splash.article#TOC_17

Slide 19

Slide 19 text

Go Tooling
- Go cover: https://blog.golang.org/cover
- Go test
- Go error handling, “Errors are Values”: https://blog.golang.org/errors-are-values
I recommend those blog posts. I also recommend Kelsey’s Kubernetes the Hard Way.
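The “Errors are Values” post describes wrapping a writer so that the first error is remembered and later writes become no-ops. Below is a minimal sketch of that idea. The `errWriter` shape follows the blog post, while `failAfter`, `errDiskFull`, and `writeRecord` are hypothetical helpers added here so the example can run on its own.

```go
package main

import (
	"bytes"
	"errors"
	"fmt"
	"io"
)

// errWriter remembers the first write error and skips all later writes.
type errWriter struct {
	w   io.Writer
	err error
}

func (ew *errWriter) write(p []byte) {
	if ew.err != nil {
		return
	}
	_, ew.err = ew.w.Write(p)
}

// errDiskFull is a made-up error used by the fake writer below.
var errDiskFull = errors.New("disk full")

// failAfter is a hypothetical writer that accepts n writes and then fails.
type failAfter struct {
	n   int
	buf bytes.Buffer
}

func (f *failAfter) Write(p []byte) (int, error) {
	if f.n == 0 {
		return 0, errDiskFull
	}
	f.n--
	return f.buf.Write(p)
}

// writeRecord writes three fields and reports the first error, if any,
// with a single check at the end instead of one check per write.
func writeRecord(w io.Writer, a, b, c string) error {
	ew := &errWriter{w: w}
	ew.write([]byte(a))
	ew.write([]byte(b))
	ew.write([]byte(c))
	return ew.err
}

func main() {
	ok := &failAfter{n: 3}
	fmt.Println(writeRecord(ok, "a", "b", "c"), ok.buf.String())

	bad := &failAfter{n: 1}
	fmt.Println(writeRecord(bad, "a", "b", "c"), bad.buf.String())
}
```

If these functions lived in a package with tests alongside them, running `go test -cover` would report the percentage of statements the tests execute, which is a quick way to spot an error branch that no test reaches.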

Slide 20

Slide 20 text

Thank you!
Verónica López
@maria_fibonacci
We’re hiring!