[DE] SOLUTION STUDY: Automatic Optimization of Kubernetes Pod Configuration Through Machine Learning

Lars Wolff
January 26, 2021

Whether IoT, logistics, healthcare and public administration, e-commerce, or consumer services: beyond digital transformation and cultural change in IT, we all have at least two things in common. First, business models depend on data processing and therefore on IT. Second, we run IT in flexible, modern environments, namely Kubernetes and the cloud.
The problem areas of performance, scalability, reliability, and cost efficiency of the infrastructure occupy us every day. They can be understood and controlled with solid tooling for load and performance testing, collaboration, and knowledge sharing.

But how does this knowledge help us deal with the countless configuration options of the infrastructure, and optimize them sustainably?

In this talk I explain the challenges that agile software development teams and DevOps organizations face in continuously learning about the behavior of their infrastructure systems. I also show how machine learning automatically determines the optimal configuration of Kubernetes Pods for the respective trade-off, from fastest performance to best possible cost efficiency.
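
The trade-off between performance and cost efficiency can be illustrated with a small sketch. It is not how StormForge Optimize works internally; it only assumes that several Pod configurations have already been tested and that each has a measured cost and latency. The `Candidate` fields, the candidate names, and all numbers are invented for illustration. The sketch keeps the configurations that no other configuration beats on both metrics (the Pareto front) and then picks the cheapest one that meets a latency budget.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Candidate:
    """One tested Pod configuration with its measured results (hypothetical data)."""
    name: str
    cpu_millicores: int
    memory_mib: int
    cost_per_hour: float   # lower is better
    p95_latency_ms: float  # lower is better


def dominates(a: Candidate, b: Candidate) -> bool:
    """True if a is no worse than b on both metrics and strictly better on one."""
    no_worse = (a.cost_per_hour <= b.cost_per_hour
                and a.p95_latency_ms <= b.p95_latency_ms)
    better = (a.cost_per_hour < b.cost_per_hour
              or a.p95_latency_ms < b.p95_latency_ms)
    return no_worse and better


def pareto_front(candidates):
    """Keep only candidates that no other candidate dominates, cheapest first."""
    front = [c for c in candidates
             if not any(dominates(other, c) for other in candidates)]
    return sorted(front, key=lambda c: c.cost_per_hour)


def pick(front, max_latency_ms):
    """Cheapest candidate that still meets the latency budget, or None."""
    within_budget = [c for c in front if c.p95_latency_ms <= max_latency_ms]
    if not within_budget:
        return None
    return min(within_budget, key=lambda c: c.cost_per_hour)


# Invented measurements, for illustration only.
CANDIDATES = [
    Candidate("small", 250, 256, 0.010, 420.0),
    Candidate("medium", 500, 512, 0.020, 180.0),
    Candidate("wasteful", 1000, 2048, 0.060, 175.0),
    Candidate("large", 1000, 1024, 0.040, 120.0),
    Candidate("oversized", 2000, 1024, 0.080, 121.0),
]

if __name__ == "__main__":
    front = pareto_front(CANDIDATES)
    print([c.name for c in front])
    print(pick(front, max_latency_ms=200.0))
```

Moving the latency budget up or down slides the choice along the front, which is one simple way to express "fastest performance" versus "best cost efficiency" as a single parameter.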

Our customers, such as Eurowings, Los Angeles Times, RTL, and Vorwerk, successfully improve performance, scalability, and reliability with StormForge Performance Testing and StormForge Optimize.

Presentation Page: https://stormforger.com/presentations/2021/ai-infrastructures/
Conference Page: https://www.scale-up-360.com/de/ai-infrastructures
Learn more at: https://www.stormforge.io

Transcript

  1. • Founded load & performance testing SaaS StormForger with Sebastian Cohnen in 2014
     • StormForger merged with Boston based machine learning company Carbon Relay in 2020, rebranded to StormForge
     • On the mission to help DevOps organizations to challenge most efficient, high-quality delivery at speed
     • AWS UserGroup Cologne/Germany
     • @larsvegas / [email protected]
     • #Agile #LeanProduct #DevOps
     Lars Wolff
  2. Challenges to Successfully Run Cloud-Native Workloads
     Performance matters
     • Speed and responsiveness?
     • Conversion?
     • Customer satisfaction?
     Scalability is key for the cloud
     • Capacity?
     • Thresholds for scaling?
     • Service utilization and impact?
     Reliability is not negotiable
     • System behavior?
     • Fault tolerance?
     • Fail-over and self-healing?
     (Cost) Efficiency is business crucial
     • Service configuration efficiency?
     • Service utilization?
     • Efficient budget spending?
  3. Perf Test Early and Often …
     The “time to first test run” is dramatically reduced (<1hr) which leads to increased test coverage and frees up time – Vorwerk
     https://aws.amazon.com/de/partners/success/vorwerk-stormforger/
  4. Perf Test Early and Often …
     • Define test cases in code and shift left
     • Start test runs any time, at any scale, from anywhere in the world, on click or via API
     • Get extensive reporting & analytics in seconds
     • Integrate testing in CI/CD, validate non-functional requirements continuously, and create quality gates
     • Democratize knowledge about behavior and let a sustainable performance culture evolve
  5. … and Optimize Automatically
     50% reduction in cost (resource utilization) with negligible performance degradation – Large online travel company
  6. … and Optimize Automatically …
     • Define objectives to optimize in configuration files
     • Trigger experiments with automatic Pod deployments, using predicted configurations to test
     • Retrieve tested candidates of optimal configurations and deploy the optimized candidate to production
     • Optimize takes care of the ML infrastructure so you can focus on real business trade-offs like cost vs. performance
     • Improve single pods and whole applications
  7. To Successfully Run Cloud-Native Workloads
     • Continuous performance testing is a must-have to challenge performance, scalability, reliability and (cost) efficiency in complex systems
     • Manual optimization of configuration is extremely time-consuming and complicated. It can be automated using machine learning
     • Collaboration on requirements and democratization of knowledge about behavior and configuration are key to letting a sustainable performance culture evolve
  8. Free DevOps from technical and organizational limits
     Deploy high-quality, scalable and reliable systems in very short cycles
     StormForge is an integral part of our production readiness strategy. Empowering our teams to assess the performance of each release within their continuous delivery pipeline, the team not only understands the risks related to their changes but it also enables us to run capacity planning and sizing for our Kubernetes cluster on AWS all the time.
     First, we used StormForge to prevent and detect performance issues. Over time, it evolved to the right tool for us to validate modification in architecture or infrastructure before every major change. StormForge is a great help to guarantee high-quality delivery to the users.
     Cynthia Dematteis-Krug, Senior Quality Assurance Engineer, Shop Apotheke Service GmbH
     Alexander Heusingfeld, Head of Digital Architecture & Infrastructure, Vorwerk
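
Slide 4 mentions validating non-functional requirements continuously and creating quality gates in CI/CD. A minimal sketch of such a gate is shown here, assuming a test run has already produced a list of request latencies and a count of failed requests. The function names, thresholds, and sample data are hypothetical and do not represent the StormForge API.

```python
import math


def percentile(samples, pct):
    """Nearest-rank percentile of a non-empty list of numbers (pct in 1..100)."""
    if not samples:
        raise ValueError("samples must not be empty")
    ordered = sorted(samples)
    rank = max(1, math.ceil(pct * len(ordered) / 100))
    return ordered[rank - 1]


def quality_gate(latencies_ms, error_count, max_p95_ms, max_error_rate):
    """Check one test run against two non-functional requirements.

    latencies_ms: latencies of successful requests, in milliseconds
    error_count:  number of failed requests
    Returns (passed, reasons), where reasons lists every violated requirement.
    """
    total = len(latencies_ms) + error_count
    if total == 0:
        return False, ["no requests recorded"]

    reasons = []
    if latencies_ms:
        p95 = percentile(latencies_ms, 95)
        if p95 > max_p95_ms:
            reasons.append(f"p95 latency {p95} ms exceeds {max_p95_ms} ms")

    error_rate = error_count / total
    if error_rate > max_error_rate:
        reasons.append(f"error rate {error_rate:.2%} exceeds {max_error_rate:.2%}")

    return not reasons, reasons


if __name__ == "__main__":
    # Invented sample data: 100 requests with latencies of 1..100 ms, no errors.
    passed, reasons = quality_gate(list(range(1, 101)), 0,
                                   max_p95_ms=100, max_error_rate=0.01)
    print("PASS" if passed else "FAIL", reasons)
```

In a pipeline, a wrapper script would typically exit with a non-zero status when `passed` is false so that the release is blocked.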