AKS-Series 3 [Video]: Migration from Azure App Service to AKS by Marc Merzinger

Stream Link: https://www.youtube.com/watch?v=FtY6Hgkpa8g

Meetup Link: https://www.meetup.com/de-DE/Microsoft-Azure-Zurich-User-Group/events/282513479/

During our project we reached the flexibility limits of Azure App Service and decided to move to Azure Kubernetes Service (AKS). In this talk I present what we learned about the changes to our CI/CD setup, the architecture and the development, and I talk about knowledge transfer and responsibilities.

Bio:
I am Marc, a Software Engineer living in Switzerland. I am employed at ti&m and part of the cloud team. My area of expertise is around containers, Kubernetes, Azure and GCP.

Links:
https://acloudjourney.io

Azure Zurich User Group

February 17, 2022
Transcript

  1. Migration from Azure App Service to AKS

    Marc Merzinger, Senior Software Engineer, ti&m cloud
    Zurich, February 2022
  2. May I introduce myself

    Senior Software Engineer @ ti&m
    - Personal blog: https://acloudjourney.io
    - Part of the Cloud Engineering Team
    - Primary clouds are Azure and GCP
    - Still a software engineer, but most of the time involved with cloud infrastructure and automation
  3. Agenda

    - Introduction to the application and runtime environment
    - Our migration approach
    - Challenges
    - Pros and cons of the migration
    - Things I would do differently the next time
    - Things to keep in mind for the next time
    15.02.22
  4. The application

    - .NET Core 3.1 based backend
    - React based frontend
    - DDD as architectural flavor
    - CQRS pattern for scalability
    - Event Sourcing for auditability by design
    - 7 executables (+ 1 for initialization jobs)
    Diagram labels: User, Frontend, Software Load Balancer, Command API, Query API, Projection Handler, MSG Bus, Doc DB, SQL DB
  5. The previous runtime environment

    - One App Service Plan and one App Service App per executable
    - State is externalized via Azure services; application internals are completely stateless
    - Secrets are stored in a Key Vault
    Diagram labels: Compute, Frontend, Software Load Balancer, Command API, Query API, Projection Handler, Jobs, Persistence, Ops, Secret Management
  6. Why we wanted to migrate to AKS

    - Network isolation with multiple App Service Apps is difficult, or becomes extremely expensive when the issue is solved with an isolated SKU (before it was changed)
    - Each App Service App resides under its own domain (custom domains are possible)
      - but we don't want internals to be publicly available
      - and some executables are not web apps
    - The complexity would carry over to every additional bounded context
  8. Our migration approach

    - Create Dockerfiles and Helm charts with Visual Studio
    - Extend the existing CI/CD pipelines in Azure DevOps
    - Extend the existing App Service Azure Resource Manager templates with AKS resources and remove App Service specifics
    - Transfer knowledge from me to two team members who are mainly involved with infra and CI/CD, and let the remainder concentrate on features
  9. The planned runtime environment

    - AKS as CaaS for all executables
    - NGINX as Ingress Controller
    - Kured for automated node restarts
    - Cert-Manager for certificate management
    - State is externalized via Azure services; application internals are completely stateless
    - Isolate access to persistence services via Azure network
    - Let executables use Azure Managed Identities to access Azure services
    - Secrets are stored in a Key Vault
    - B2C as identity solution
    Diagram labels: AKS, Networking, Persistence, Ops, Secret Management; APP (Command API, Projection Handler, Query API, Frontend, Jobs, Ingress for Software LB); Ingress; Cert-Manager; Kured; …
    * Pods, Replica Sets etc. are omitted for simplicity.
    Kubernetes icons: https://github.com/kubernetes/community/tree/master/icons
  10. Extend the Azure Resource Manager Templates

    - Multiple Azure Resource Manager templates were used to deploy the App Service environment
    - We extended these with AKS templates and added a separate job to install basic cluster components
    Diagram labels: Infra CI (Template Tests), Release trigger, Infra CD* (ARM Deploy, Upgrade NGINX Ingress, Upgrade Kured, Upgrade Cert-Manager)
    * Simplified; hardening and additional configuration such as the cert-issuer setup were done in this pipeline as well
  11. Containerize executables

    - Visual Studio provides support for autogeneration of the Dockerfile
      - Considers the required .NET (Core) version and cross-project dependencies
      - Provides multi-stage Dockerfiles
      - Ready for development and deployment
    But…
    - It relies on buster-slim for release images. They can contain many vulnerabilities, which we verified with Trivy.
    - It references projects statically, which requires regenerating the Dockerfile each time a project reference changes
    https://github.com/aquasecurity/trivy
  12. Generate charts for each executable

    Visual Studio provides support for autogeneration of Helm charts
    - Which basically does a "helm create"
    - Deployments, Ingress, Service, Service Account etc. are available with default values
    Weak default security posture
    - No rootless, no read-only root fs, no dropped capabilities and so on
    - We used checkov to implement best practices
    - We added the horizontal pod autoscaler manually
    https://github.com/bridgecrewio/checkov
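
The three gaps named on this slide (rootless, read-only root fs, dropped capabilities) map to fields of the Kubernetes container `securityContext`. The following is a minimal sketch of such a check in Python, not the way checkov works internally. The function name `hardening_findings`, the container name and the image reference are hypothetical, and only the container-level `securityContext` is inspected.

```python
from typing import Dict, List


def hardening_findings(container: Dict) -> List[str]:
    """List the hardening settings from the slide that a container spec lacks.

    Only the container-level securityContext is inspected; a real scanner
    would also consider the pod-level securityContext.
    """
    ctx = container.get("securityContext", {})
    findings = []
    if ctx.get("runAsNonRoot") is not True:
        findings.append("container may run as root")
    if ctx.get("readOnlyRootFilesystem") is not True:
        findings.append("root filesystem is writable")
    if "ALL" not in ctx.get("capabilities", {}).get("drop", []):
        findings.append("capabilities are not dropped")
    return findings


# Hypothetical container spec without any securityContext.
generated = {"name": "command-api", "image": "example.azurecr.io/command-api:1.0.0"}

# The same container with the three settings applied.
hardened = {
    **generated,
    "securityContext": {
        "runAsNonRoot": True,
        "readOnlyRootFilesystem": True,
        "capabilities": {"drop": ["ALL"]},
    },
}

for spec in (generated, hardened):
    print(spec["name"], hardening_findings(spec))
```

In a Helm chart these values would typically live in the `securityContext` section of the deployment template or its values file.
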
  13. CI setup with Azure DevOps

    Diagram labels: Git Repo; Push, PR trigger; Container* (Build, Test, Release, Scan, Push); Chart* (Scan, Package, Publish); Azure Container Registry
    * Similar setup for the frontend
    https://github.com/aquasecurity/trivy and https://github.com/bridgecrewio/checkov
  14. CD setup with Azure DevOps

    - Backend and frontend are released separately
    - Backend is always deployed completely (all executables)
    - Using the --wait flag to check if the deployment was successful (works partially)
    - Basic cluster components (Ingress Controller, Cert-Manager etc.) are deployed as part of the infrastructure pipeline
    Diagram labels: Container CI, Chart CI, Pipeline release finished trigger, Dev Stage (Upgrade Command API, Upgrade Projection Handler, Upgrade Query API, Upgrade Jobs)
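
A minimal sketch of how one upgrade step per executable could be assembled, assuming Helm 3 (where `--timeout` takes a duration). The release names, chart paths and the `app` namespace are hypothetical, and the commands are only printed, not executed.

```python
import shlex
from typing import List


def helm_upgrade_command(release: str, chart: str, namespace: str,
                         timeout: str = "5m") -> List[str]:
    """Build the argument list for one 'helm upgrade' call that waits for readiness."""
    return [
        "helm", "upgrade", release, chart,
        "--install",               # create the release if it does not exist yet
        "--namespace", namespace,
        "--wait",                  # block until the release's resources are ready
        "--timeout", timeout,      # give up waiting after this duration
    ]


# Hypothetical release names, following the diagram's upgrade steps.
backend_releases = ["command-api", "projection-handler", "query-api", "jobs"]

for release in backend_releases:
    cmd = helm_upgrade_command(release, f"./charts/{release}", "app")
    print(shlex.join(cmd))
```

Passing such a list to `subprocess.run(cmd, check=True)` would fail the pipeline step when Helm exits with an error, for example after the timeout.
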
  15. Transfer knowledge to the team

    - Joined the team for approx. 4 weeks (part time)
    - Implemented the heavy-lifting tasks (infra design, automation, CI/CD adaptation)
    - Regular sync with the architect and the colleagues who maintain CI/CD and infra; transfer of work results as well as knowledge
    - Colleagues implemented the enhancements for the jobs themselves (whole process of Dockerfile, Helm charts, CI/CD adaptation etc.)
    - Left the team but stayed on standby for questions or enhancements, with a sync roughly once a month
    Diagram labels: PO (Customer), Scrum Master, FE Dev, FE Dev, BE Arch/Dev, BE Dev, Me
  16. Log Based Observability

    - Log Analytics agent runs as a DaemonSet on all nodes
    - Complete environment and cluster logs are stored in Log Analytics
    - Integration of Log Analytics is simple and easy to use
    - Be aware of audit logs (especially if you include reads)
    Diagram labels: AKS (APP, Ops, Ingress, Cert-Manager, Kured, …), Other Azure Services, Log Analytics, Dashboards, Alerts, Team, Observe
    * Pods, Replica Sets etc. are omitted for simplicity.
  17. Avoid inconsistencies inside the cluster

    - Azure Policy is used for the whole Azure environment, including AKS
    - Works on top of Gatekeeper and OPA
    - Mapping from Azure Policies into policies for Kubernetes resources
    - Standard Baseline Initiative is applied to the whole cluster
    - Default policies are very useful for standard applications
      - Deny privileged containers
      - Deny specific capabilities
    Diagram labels: Azure Policy, Standard Baseline Initiative, Sync, AKS, kube-system (Policy Addon Controller), Enforce, APP (Application), Use
    Policies: https://docs.microsoft.com/en-us/azure/aks/policy-reference
  18. Resource Isolation

    - External traffic is only allowed to reach the ingress
    - Ingress can only communicate with the app namespace
    - Traffic from the app to persistence (etc.) uses VNET integration
    - L4 Network Policies in K8s are quite restricted (implemented by Azure CNI)
    Diagram labels: Users, AKS (Ingress, APP, …), VNET Integration, Persistence, Secret Management
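
The rule that only the ingress may talk to the app namespace can be expressed as a Kubernetes NetworkPolicy. The sketch below builds such a manifest as a Python dict and prints it as JSON. The namespace names `app` and `ingress` and the policy name are hypothetical, and it assumes a cluster version that sets the `kubernetes.io/metadata.name` label on namespaces automatically; on older clusters a custom namespace label would be needed.

```python
import json


def allow_from_ingress_namespace(app_namespace: str, ingress_namespace: str) -> dict:
    """Build a NetworkPolicy that admits inbound traffic only from one namespace."""
    return {
        "apiVersion": "networking.k8s.io/v1",
        "kind": "NetworkPolicy",
        "metadata": {
            "name": "allow-ingress-controller-only",  # hypothetical name
            "namespace": app_namespace,
        },
        "spec": {
            "podSelector": {},            # empty selector: all pods in the namespace
            "policyTypes": ["Ingress"],   # only inbound traffic is restricted
            "ingress": [{
                "from": [{
                    "namespaceSelector": {
                        "matchLabels": {
                            "kubernetes.io/metadata.name": ingress_namespace,
                        },
                    },
                }],
            }],
        },
    }


policy = allow_from_ingress_namespace("app", "ingress")
print(json.dumps(policy, indent=2))
```

Because JSON is valid YAML, the printed manifest could be piped into `kubectl apply -f -` or pasted into a chart template.
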
  19. Certificate Management

    - Cert-Manager in combination with Let's Encrypt
    - Two domains (dev/test and one prod)
    - Almost no operations effort
    - Upgrades of Cert-Manager need attention
    Diagram labels: AKS; Ingress (NGINX Ingress Controller); APP (Application); cert-manager (Cert-Manager); Let's Encrypt; ACME (http01); Store cert in secret; Use secret; api.example.com; www.example.com
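
A minimal sketch of an issuer for the ACME http01 flow shown in the diagram, built as a Python dict and printed as JSON. Whether the project used a `ClusterIssuer` or a namespaced `Issuer` is not stated in the slides; the issuer name, the e-mail address and the secret name are hypothetical.

```python
import json

# Production directory URL of Let's Encrypt's ACME v2 API.
LETS_ENCRYPT_PRODUCTION = "https://acme-v02.api.letsencrypt.org/directory"


def http01_cluster_issuer(name: str, email: str, ingress_class: str = "nginx") -> dict:
    """Build a cert-manager ClusterIssuer that solves ACME http01 challenges via an ingress."""
    return {
        "apiVersion": "cert-manager.io/v1",
        "kind": "ClusterIssuer",
        "metadata": {"name": name},
        "spec": {
            "acme": {
                "server": LETS_ENCRYPT_PRODUCTION,
                "email": email,
                # Secret in which cert-manager stores the ACME account key.
                "privateKeySecretRef": {"name": f"{name}-account-key"},
                "solvers": [{"http01": {"ingress": {"class": ingress_class}}}],
            },
        },
    }


issuer = http01_cluster_issuer("letsencrypt-prod", "ops@example.com")
print(json.dumps(issuer, indent=2))
```

An Ingress would then reference the issuer through the `cert-manager.io/cluster-issuer` annotation, and cert-manager stores the issued certificate in the secret named in the Ingress `tls` section.
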
  21. Challenge: Efficient Resource Usage

    - One App Service Plan hosts multiple instances without overhead
    - AKS and our environment produce a big overhead
      - By default, AKS has one node pool that hosts a few cluster components (DNS, log agents, Tunnelfront and more)
      - We require even more components for our environment (Cert-Manager, Kured, NGINX, Falco, Pod Identity, …)
    - Development and integration environments should cost as little as possible
    - Solution: Use low resource requests for these environments and limit the cluster autoscaler to 2 nodes max.
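
Whether low requests fit into a two-node limit is simple arithmetic over CPU requests. The sketch below shows the idea with purely illustrative numbers; none of the request values or the allocatable CPU figure come from the talk. It compares totals only and ignores memory, per-node DaemonSet copies and how the scheduler packs pods onto nodes.

```python
from typing import Dict


def millicores(cpu: str) -> int:
    """Convert a Kubernetes CPU quantity such as "250m" or "2" to millicores."""
    if cpu.endswith("m"):
        return int(cpu[:-1])
    return int(float(cpu) * 1000)


def fits(requests: Dict[str, str], allocatable_per_node: str, max_nodes: int) -> bool:
    """Check whether the summed CPU requests fit into the capacity of max_nodes nodes."""
    total = sum(millicores(value) for value in requests.values())
    return total <= millicores(allocatable_per_node) * max_nodes


# Illustrative CPU requests for a development environment (assumed values).
dev_requests = {
    "ingress-nginx": "100m",
    "cert-manager": "50m",
    "kured": "10m",
    "falco": "100m",
    "command-api": "100m",
    "query-api": "100m",
    "projection-handler": "100m",
    "frontend": "50m",
}

# Assumed allocatable CPU per node; the real value depends on the VM size.
print(fits(dev_requests, allocatable_per_node="1900m", max_nodes=2))
```
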
  22. Challenge: Base image vulnerabilities

    - The default base image (buster-slim) used in the generated Dockerfile contains too many vulnerabilities
    - We switched to alpine…
    - And had 0 vulnerabilities, but we broke our application
    - Obviously alpine images do not have cultures installed
    Solution: Modify the Dockerfile and install cultures
    https://andrewlock.net/dotnet-core-docker-and-cultures-solving-culture-issues-porting-a-net-core-app-from-windows-to-linux/
  23. (Small) Challenge: Read-Only RootFS

    - The backend apps are based on the latest version of ASP.NET Core 3.1
    - No dependencies on the filesystem (at least we thought so)
    → After enabling read-only rootfs, file uploads failed
    - .NET Core uses /tmp to buffer large files
    - Solution: Use ephemeral volumes for the /tmp directories, with disk as the underlying storage technology. Why disk? We are limited in memory.
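
A minimal sketch of the solution as a pod spec fragment, built in Python and printed as JSON. It assumes the ephemeral volume is an `emptyDir`; the container name, image and volume name are hypothetical. An `emptyDir` without a `medium` is backed by the node's storage, whereas `"medium": "Memory"` would use tmpfs and count against memory.

```python
import json


def with_writable_tmp(container: dict) -> dict:
    """Return a pod spec fragment with a read-only root fs and a disk-backed /tmp."""
    hardened = dict(container)
    hardened["securityContext"] = {
        **container.get("securityContext", {}),
        "readOnlyRootFilesystem": True,
    }
    hardened["volumeMounts"] = list(container.get("volumeMounts", [])) + [
        {"name": "tmp", "mountPath": "/tmp"},
    ]
    return {
        "containers": [hardened],
        # No "medium" set: the volume lives on node storage, not in memory.
        "volumes": [{"name": "tmp", "emptyDir": {}}],
    }


# Hypothetical container definition.
pod_spec = with_writable_tmp(
    {"name": "command-api", "image": "example.azurecr.io/command-api:1.0.0"}
)
print(json.dumps(pod_spec, indent=2))
```
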
  24. Growing Demand for Resources

    - Sizing for operations has changed the resource requirements
    - More tools were required during the project
    - K6 testing has shown some limits
    - Started with Standard_DS2_v2 SKU (2 CPU, 7GB RAM)
    - Ended up with Standard_D4s_v3 SKU (4 CPU, 16GB RAM)
    Diagram categories: AKS Baseline Tools; Additional Tools we planned to install; Tools that became necessary over time
    Diagram labels: Log Analytics Agent, Core DNS, (Azure) CNI, Azure Defender for Containers, Falco, OPA/Gatekeeper for Azure Policy, Cert-Manager, Pod Identity, kured, …
    And we have not yet talked about the application
  26. Pros and Cons of the migration

    Pros
    - Greater flexibility in all matters
    - In-sourcing of initialization and clean-up jobs: one API to solve many problems
    - Allows the application to grow consistently without adding much complexity
    Cons
    - Almost everything is «do it on your own»
    - Added responsibilities, and the need for knowledge is high
    - Low support by the IDE, at least for our way of development
  28. Things I would do differently the next time

    Do the security related stuff as early as possible
    - Saves time in the long run
    - Avoids issues with security settings (read-only filesystem etc.)
    Map a bounded context with N services (because of architectural style) to one Helm chart, because
    - They are mostly deployed together
    - Simpler CI/CD setup
    - Centralization of common resources (base network policies etc.)
    - … or use Helmfile

    - Split Helm charts and app source code into separate repositories
    - Split the basic cluster components pipeline from the infrastructure pipeline
    - Use GitOps, depending on the team
  30. Things to keep in mind when migrating to Kubernetes

    - Many more responsibilities than with PaaS services like App Service
      - Plan for extra work on cluster management tasks
      - Plan for extra work on networking tasks
    - You must cope with more components to update (ingress controller, cert-manager, kured, the application and whatever else you install on the cluster); carefully consider version dependencies
    - The team needs to know what it can do with K8s (e.g. jobs)
      - We had to run cyclic jobs, which were initially realized with Azure Functions. As we moved to K8s, we could solve this with K8s concepts as well.
    - Automate as much as possible; it simplified our work
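
The Kubernetes concept for cyclic jobs is the CronJob. A minimal sketch of such a manifest, built as a Python dict and printed as JSON, follows; the job name, image and schedule are hypothetical, and the slides do not say which settings the project used.

```python
import json


def cron_job(name: str, image: str, schedule: str) -> dict:
    """Build a CronJob manifest that runs one container on a cron schedule."""
    return {
        "apiVersion": "batch/v1",
        "kind": "CronJob",
        "metadata": {"name": name},
        "spec": {
            "schedule": schedule,            # standard five-field cron expression
            "concurrencyPolicy": "Forbid",   # skip a run while the previous one is active
            "jobTemplate": {
                "spec": {
                    "template": {
                        "spec": {
                            "restartPolicy": "OnFailure",
                            "containers": [{"name": name, "image": image}],
                        },
                    },
                },
            },
        },
    }


# Hypothetical clean-up job that runs every night at 02:00.
manifest = cron_job("cleanup", "example.azurecr.io/jobs:1.0.0", "0 2 * * *")
print(json.dumps(manifest, indent=2))
```
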
  31. We are hiring!

    Surf the wave of success with us
    Are you a cloud expert? We're looking for you!
    ti8m.com/en/career
  32. We digitalize your company.

    ti8m.com
    - ti&m AG Zurich, Buckhauserstrasse 24, CH-8048 Zurich, +41 44 497 75 00
    - ti&m AG Bern, Monbijoustrasse 68, CH-3007 Bern, +41 44 497 75 00
    - ti&m GmbH Frankfurt am Main, Schaumainkai 91, D-60596 Frankfurt am Main, +49 69 247 5584 20
    - ti&m Pte. Ltd., 18 Robinson Road #15-16, Singapore 048547, Singapore, +65 6955 7755