Automatically optimize cloud-native systems for fun and profit

Lars Wolff
September 28, 2021

When running production workloads, we are all challenged by performance, scalability, reliability and (cost) efficiency sooner or later. We need to understand the true behavior of a complex, distributed system and how to optimize it – folks say these things didn’t get easier with Kubernetes. 😉

Lars would like to discuss how to understand the complexity of our systems and why integrated load and performance testing is super-awesome. Further he thinks the future is fully automated optimization.

Lars Wolff is co-founder of StormForger, a performance testing SaaS for agile and DevOps teams. In 2020, StormForger and the US company Carbon Relay merged to form StormForge (https://www.stormforge.io/), combining load and performance testing with machine learning.
Lars' career runs from product development to consulting and agile coaching to company formation. Today he mainly helps teams to deliver reliable, scalable and fast systems.
He’s in love with DevOps, Scrum, Kanban, Whiteboards and Post-its®.

Transcript

  1. Lars Wolff

    • Software Development
    • Agile Coaching
    • Founded load & performance testing SaaS StormForger with Sebastian Cohnen in 2014
    • StormForger merged with Boston-based machine learning company Carbon Relay in 2020, rebranded to StormForge
    • On a mission to help cloud-native organizations achieve efficient, high-quality delivery at speed
    • AWS UserGroup Lead Cologne/Germany
    • @larsvegas / [email protected]
    • #Post-it #Agile #Scrum #Kanban #LeanProduct #DevOps
  2. Challenges to Successfully Run Cloud-Native Workloads

    Performance matters
    • Speed and responsiveness?
    • Conversion?
    • Customer satisfaction?

    Scalability is key for the cloud
    • Capacity?
    • Thresholds for scaling?
    • Service utilization and impact?

    Reliability is not negotiable
    • System behavior?
    • Fault tolerance?
    • Fail-over and self-healing?

    (Cost) Efficiency is crucial for the business
    • Service configuration efficiency?
    • Service utilization?
    • Efficient budget spending?
  3. Face the Challenges and …

    • …gain an understanding of the behavior of your system
    • …create a clear and common ground for the non-functional requirements for yourself, your team AND THE BUSINESS
    • …gather insights and data to decide on sensible improvements and optimizations (Spoiler: There are a lot of trade-offs to make)
  4. Perf Test Early and Often

    The “time to first test run” is dramatically reduced (<1hr) which leads to increased test coverage and frees up time
    – Vorwerk
    https://aws.amazon.com/de/partners/success/vorwerk-stormforge/
  5. Perf Test Early and Often

    • Define test cases in code and shift left
    • Start test runs any time, at any scale, from anywhere in the world with one click or via API
    • Get extensive reporting & analytics in seconds
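To illustrate what "test cases in code" can look like, here is a minimal Python sketch. It is not the StormForge test case format: the `LoadTestCase` class, the `fake_checkout` target and the test name are hypothetical names chosen for this example, and the target is a local stand-in rather than a real HTTP call.

```python
import time
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class LoadTestCase:
    """A load test described as data, so it can live in version control."""
    name: str
    requests: int                  # total number of requests to send
    target: Callable[[], bool]     # hypothetical request function, True on success
    latencies_ms: List[float] = field(default_factory=list)
    errors: int = 0

    def run(self) -> None:
        """Send the requests sequentially and record latency and failures."""
        for _ in range(self.requests):
            start = time.perf_counter()
            ok = self.target()
            self.latencies_ms.append((time.perf_counter() - start) * 1000.0)
            if not ok:
                self.errors += 1


def fake_checkout() -> bool:
    """Stand-in for a real request against the system under test."""
    return True


case = LoadTestCase(name="checkout-smoke", requests=50, target=fake_checkout)
case.run()
print(f"{case.name}: {len(case.latencies_ms)} requests, {case.errors} errors")
```

Because the definition is plain code, it can be reviewed, versioned and changed in the same commit as the feature it covers.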
  6. Damn! New Problems in Sight!

    • “Manual testing” takes a lot of effort and is time-consuming
    • The state of the test case and test data is probably out of date by the next cycle/manual test session
    • Most important: The state of the system under test (SuT) will have changed. We all strive to ship fast on a regular basis, and for good reason.
  7. CI/CD Pipeline Integration and Automation

    • Automate all the things! ALWAYS FUN!
    • Move in a development motion like:
        • Introduce new feature/endpoint ✅
        • Create functional tests ✅
        • Update/maintain performance test case 🤷👋👋👋
        • Commit, integrate and run tests 👍
    • Use our CLI to integrate and validate non-functional requirements continuously
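Validating non-functional requirements in a pipeline boils down to comparing measured results against agreed thresholds and failing the build when they are missed. The Python sketch below shows that idea only. It does not use the StormForge CLI, and the sample latencies, the thresholds and the function names are assumptions made for illustration.

```python
import math
from typing import Dict, List


def percentile(values: List[float], pct: float) -> float:
    """Nearest-rank percentile; pct is in the range (0, 100]."""
    ordered = sorted(values)
    rank = max(1, math.ceil(pct / 100.0 * len(ordered)))
    return ordered[rank - 1]


def check_requirements(latencies_ms: List[float], errors: int,
                       max_p95_ms: float, max_error_rate: float) -> Dict[str, bool]:
    """Compare measured results against the agreed thresholds."""
    error_rate = errors / len(latencies_ms)
    return {
        "p95_latency": percentile(latencies_ms, 95) <= max_p95_ms,
        "error_rate": error_rate <= max_error_rate,
    }


# Hypothetical results of one test run and hypothetical thresholds.
latencies = [120.0, 95.0, 110.0, 300.0, 105.0, 98.0, 102.0, 130.0, 99.0, 101.0]
results = check_requirements(latencies, errors=0, max_p95_ms=250.0, max_error_rate=0.01)

# A CI job would pass this value to sys.exit() so a violation fails the build.
exit_code = 0 if all(results.values()) else 1
print(results, "exit code:", exit_code)
```

In this sample the single 300 ms outlier pushes the 95th percentile over the threshold, so the gate reports a failure even though no request errored.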
  8. Automatic Optimization to the Rescue!

    50% reduction in cost (resource utilization) with negligible performance degradation
    – Large online travel company
  9. Automatic Optimization to the Rescue!

    • Define objectives to optimize in configuration files
    • Trigger experiments with automatic deployments using a predicted configuration to test
    • Retrieve results of all tested candidates and pick the best™ one
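The loop of defining an objective, testing candidate configurations and picking the best one can be sketched in a few lines of Python. This is a plain exhaustive search over a small grid, not the machine-learning approach the talk refers to. The cost model, the latency formula and all names (`Candidate`, `measure_p95_ms`, `best_candidate`) are invented for illustration; a real experiment would deploy each candidate and run a load test against it.

```python
from dataclasses import dataclass
from itertools import product
from typing import List, Optional


@dataclass(frozen=True)
class Candidate:
    cpu_millicores: int
    memory_mib: int
    replicas: int


def cost(c: Candidate) -> float:
    """Hypothetical cost model: price units per CPU core and per GiB of memory."""
    return c.replicas * (c.cpu_millicores / 1000.0 * 30.0 + c.memory_mib / 1024.0 * 4.0)


def measure_p95_ms(c: Candidate) -> float:
    """Stand-in for deploying the candidate and load testing it.

    The formula is invented: more total CPU lowers latency.
    """
    total_cores = c.replicas * c.cpu_millicores / 1000.0
    return 50.0 + 400.0 / total_cores


def best_candidate(candidates: List[Candidate], max_p95_ms: float) -> Optional[Candidate]:
    """Return the cheapest candidate that still meets the latency objective."""
    feasible = [c for c in candidates if measure_p95_ms(c) <= max_p95_ms]
    return min(feasible, key=cost) if feasible else None


# Objective: minimize cost while keeping p95 latency at or below 250 ms.
search_space = [Candidate(cpu, mem, n)
                for cpu, mem, n in product([250, 500, 1000], [256, 512], [1, 2, 4])]
best = best_candidate(search_space, max_p95_ms=250.0)
print(best, cost(best) if best else None)
```

Even this toy grid has 18 combinations for three parameters. Real services have many more parameters, so trying every combination quickly becomes impractical.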
  10. Let’s Control Complexity!

    • I don’t think that anyone should spend time on manual testing
    • I don’t think that anyone should spend time on manual optimization
    • I think you should focus on engineering and operations while mature but boring technology takes care of the work.
  11. Let’s Control Complexity!

    • Continuous Performance Testing is a must-have to face the challenges of performance, scalability, reliability and efficiency
    • Because of the huge complexity, humans cannot optimize manually with a reasonable amount of effort. Machine learning, however, can solve the problem easily.
    • Bonus: Democratize knowledge about the system's capabilities and efficiency. Let a sustainable performance culture evolve and strive for operational excellence.
  12. Free DevOps from technical and organizational limits

    Deploy high-quality, scalable and reliable systems in very short cycles

    “StormForge is an integral part of our production readiness strategy. Empowering our teams to assess the performance of each release within their continuous delivery pipeline, the team not only understands the risks related to their changes but it also enables us to run capacity planning and sizing for our Kubernetes cluster on AWS all the time.”

    “First, we used StormForge to prevent and detect performance issues. Over time, it evolved to the right tool for us to validate modification in architecture or infrastructure before every major change. StormForge is a great help to guarantee high-quality delivery to the users.”

    Cynthia Dematteis-Krug, Senior Quality Assurance Engineer, Shop Apotheke Service GmbH
    Alexander Heusingfeld, Head of Digital Architecture & Infrastructure, Vorwerk