Slide 1

Slide 1 text

Azure Bootcamp – 16.05.2024
From Zero to IaC & AKS

Slide 2

Slide 2 text

Jędrzej Lisowski
Andreea Oltean

Slide 3

Slide 3 text

A journey to Infrastructure as Code and AKS
Chapter 1: From Decentralized to Centralized. A story about a team which grew up overnight
Chapter 2: How it works. Tech Stack, The New Way of Working, Processes & Standards
Chapter 3: Stats for geeks. Numbers Numbers Numbers
Chapter 4: Let’s get serious. AKS Deep Dive
Chapter 5: Conclusions. Lessons Learnt and Future Plans

Slide 4

Slide 4 text

A journey to Infrastructure as Code and AKS
Chapter 1: From Decentralized to Centralized. A story about a team which grew up overnight

Slide 5

Slide 5 text

From Decentralized to Centralized
• Cloud Center of Excellence: 3 members; Cloud Security, Governance and Compliance
• Limited reusability: Solutions designed for each stream, not meant to be re-used
• Multiple technologies: Different languages and technologies were used for the same objectives
• Different implementations: Every AKS cluster was deployed in a different way
• Own IT teams: Each stream had its own IT teams and standards
• Limited knowledge sharing: Collaboration among streams was challenging

Slide 6

Slide 6 text

From Decentralized to Centralized
• Cloud Center of Excellence has 10 members (DevOps & DBOps)
• Compliance, governance, standards, operations and engineering for the entire Swiss Life public cloud landscape
• Build a self-service approach for provisioning cloud infrastructure
• Everything is open, contribution is welcome

Slide 7

Slide 7 text

No content

Slide 8

Slide 8 text

A journey to Infrastructure as Code and AKS
Chapter 2: How it works. Tech Stack, The New Way of Working, Processes & Standards

Slide 9

Slide 9 text

How it works: Tech Stack
• Microsoft Azure: Primary cloud platform that hosts workloads across compute, database and supporting services
• Terraform: The main Infrastructure as Code language used to manage Azure resources
• Azure DevOps: Code is stored in Azure Repos; deployments are handled with Azure Pipelines
• GitHub Copilot: A friend in need
• VS Code Pets Extension

Slide 10

Slide 10 text

How it works: The New Way of Working
• Live Infrastructure: Terraform code mapping of real Azure resources using modules
• Live Infrastructure Template: Base repository with examples, templates, standards and guidelines to understand and create live infrastructure
• Automatic bootstrapping: The process that relies on the Live Infrastructure Template to automatically create the skeleton of a live infrastructure repository
• Terraform Modules: Abstract packages for single-scoped deployments (Azure Storage Account, Landing Zone Networking etc.)
• Terraform Module Template: Base repository with examples, templates, standards and guidelines to understand and create modules
• Multi-repo approach: Each Terraform Module lives in its own repository

Slide 11

Slide 11 text

How it works: Development Process
Trunking & Versioning
• Trunk-based development with short-lived branches
• Squash commits only to master
• Tags according to the semantic versioning system
Conventions & Standards
• Feature branches and repositories naming convention
• Specific names and structure for files and local Terraform resources
CHANGELOG & README
• CHANGELOG.md: The summary of changes, work item link, date and tag version
• README.md: Technical details and deviation from standards
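The tagging convention can be sketched in a few lines of Python. This is a minimal illustration, assuming plain MAJOR.MINOR.PATCH tags with no prefix or pre-release suffix; the names parse_tag, bump and latest are hypothetical and are not taken from the team's tooling.

```python
import re

# Assumption: tags are plain MAJOR.MINOR.PATCH, e.g. "1.4.2".
SEMVER_TAG = re.compile(r"(\d+)\.(\d+)\.(\d+)")


def parse_tag(tag):
    """Split a tag into a (major, minor, patch) tuple of integers."""
    match = SEMVER_TAG.fullmatch(tag)
    if match is None:
        raise ValueError(f"not a semantic version tag: {tag!r}")
    return tuple(int(part) for part in match.groups())


def bump(tag, change):
    """Return the next tag for a 'major', 'minor' or 'patch' change."""
    major, minor, patch = parse_tag(tag)
    if change == "major":
        return f"{major + 1}.0.0"
    if change == "minor":
        return f"{major}.{minor + 1}.0"
    if change == "patch":
        return f"{major}.{minor}.{patch + 1}"
    raise ValueError(f"unknown change type: {change!r}")


def latest(tags):
    """Pick the highest tag, comparing numerically rather than as text."""
    return max(tags, key=parse_tag)
```

Comparing the parsed tuples instead of the raw strings matters once a component reaches two digits: as text, "1.9.0" would sort after "1.10.0".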

Slide 12

Slide 12 text

No content

Slide 13

Slide 13 text

No content

Slide 14

Slide 14 text

No content

Slide 15

Slide 15 text

How it works: Development Process
Trunking & Versioning
• Trunk-based development with short-lived branches
• Squash commits only to master
• Tags according to the semantic versioning system
Conventions & Standards
• Feature branches and repositories naming convention
• Specific names and structure for files and local Terraform resources
Granular Permissions
• Each department can contribute to Terraform Modules and their Live Infrastructure repos & pipelines
• Reader SPN for terraform plan
• Contributor SPN for terraform apply
Remote State Files
• Each workload's state is stored in a blob container deployed in the target Azure Subscription
• Backend SPN for access to state files
Pull Requests
• PR templates
• Change version tags
• Build validations for end-to-end testing
• Security involved
CHANGELOG & README
• CHANGELOG.md: The summary of changes, work item link, date and tag version
• README.md: Technical details and deviation from standards
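The split of service principals by Terraform command can be pictured as a simple lookup. The sketch below only restates the three roles named on the slide; the function name identities_for and the dictionary layout are hypothetical, and no real Azure or Terraform API is called.

```python
# Assumption: the labels are shorthand for the three SPNs named on the slide.
AZURE_SPN_BY_COMMAND = {
    "plan": "Reader SPN",        # read-only access is enough to compute a plan
    "apply": "Contributor SPN",  # changing resources needs write access
}
STATE_SPN = "Backend SPN"        # used only to reach the remote state files


def identities_for(command):
    """Return which identity a pipeline step uses for Azure and for state."""
    try:
        azure_spn = AZURE_SPN_BY_COMMAND[command]
    except KeyError:
        raise ValueError(f"unsupported Terraform command: {command!r}") from None
    return {"azure": azure_spn, "state": STATE_SPN}
```

Keeping the state identity separate from both resource identities means a plan step never holds write access to the subscription's resources.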

Slide 16

Slide 16 text

How it works: Deployment Process
[Diagram: PR Pipeline (push Terraform remotely, Pull Request, Terraform Plan & Apply, Terraform Plan) and Deployment Pipeline (merge to master, tag the new version, Terraform Apply) across the stages D, Q and P; one step is marked "auto".]

Slide 17

Slide 17 text

A journey to Infrastructure as Code and AKS
Chapter 3: Stats for geeks. Numbers Numbers Numbers

Slide 18

Slide 18 text

Stats for geeks
• 4500 Pipeline runs
• 500 Pull Requests
• 70 Subscriptions
• > 100 Repositories
• 60000 Lines of code
• > 200 Pipelines
• 6000 Azure Services

Slide 19

Slide 19 text

A journey to Infrastructure as Code and AKS
Chapter 4: Let’s get serious. AKS Deep Dive

Slide 20

Slide 20 text

AKS Architecture
Chapter 4 – Microsoft reference architecture for AKS

Slide 21

Slide 21 text

AKS Architecture
Chapter 4 – Swiss Life reference architecture for AKS

Slide 22

Slide 22 text

AKS Environment: Setup
• Simplified deployments and operations: One reusable end-to-end solution contained within the AKS Terraform Module
• Stateless workloads: Stateless workloads can easily survive cluster delete and recreation operations
• Separate AKS clusters for non-standards: 3rd-party software or deployments which require further isolation are deployed on separate AKS clusters
• Linux containers only: All system and user node pools are built on Linux
• One internal shared AKS cluster: Our offer includes a single AKS cluster for all applications across the organization, with isolation achieved through namespaces and network policies. Further isolation is achieved with separate node pools
• GitOps deployment model: A standard way to deploy infrastructure and applications using ArgoCD

Slide 23

Slide 23 text

AKS Environment: Terraform AKS Module
-------------------------------------------------------------------------------
Language                     files          blank        comment           code
-------------------------------------------------------------------------------
HCL                             47             76             27           2120
YAML                            15             10              2            737
Markdown                         6             77              6            726
-------------------------------------------------------------------------------
SUM:                            68            163             35           3583
-------------------------------------------------------------------------------

Slide 24

Slide 24 text

AKS Environment: Terraform AKS Live Infrastructure
-------------------------------------------------------------------------------
Language                     files          blank        comment           code
-------------------------------------------------------------------------------
HCL                             29             17             27            655
YAML                             2             16              9            210
Markdown                         1              1              0              5
-------------------------------------------------------------------------------
SUM:                            32             34             36            870
-------------------------------------------------------------------------------

Slide 25

Slide 25 text

AKS Deep Dive: Cluster Setup
• Entra ID authentication for cluster RBAC, plus workload identities to authorize pods with external services
• Fully private cluster
• Saving costs with Spot instances: Spot instances are used for non-production clusters
• The cluster is always ZRS
• Minimum 3 nodes in a pool
• "Stateless" nodes: Nodes with ephemeral drives
• Monitoring with Container Insights: The cluster is monitored with the built-in Container Insights
• Calico network policy plugin
• Azure CNI Overlay

Slide 26

Slide 26 text

AKS Deep Dive Let’s go deeper

Slide 27

Slide 27 text

AKS Deep Dive: Let’s go deeper
• ArgoCD: Optional, enabled by default
• kured: Required
• cert-manager: Optional, disabled by default
• Certificates: Required
• kyverno: Optional, enabled by default
• Default Network Policy: Optional, enabled by default
• sealed-secrets: Optional, enabled by default
• ingress-nginx: Required
• storage-class: Required
• keda: Optional, enabled by default
• velero: Required
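The required/optional split of the cluster add-ons can be modelled as a small catalog with per-cluster overrides. This is a Python sketch only: the AddOn class, the CATALOG list and the resolve function are hypothetical names, and the actual AKS Terraform Module may expose these switches differently.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AddOn:
    name: str
    required: bool
    enabled_by_default: bool = True


# Values copied from the slide; the data structure itself is an assumption.
CATALOG = [
    AddOn("ArgoCD", required=False),
    AddOn("kured", required=True),
    AddOn("cert-manager", required=False, enabled_by_default=False),
    AddOn("Certificates", required=True),
    AddOn("kyverno", required=False),
    AddOn("Default Network Policy", required=False),
    AddOn("sealed-secrets", required=False),
    AddOn("ingress-nginx", required=True),
    AddOn("storage-class", required=True),
    AddOn("keda", required=False),
    AddOn("velero", required=True),
]


def resolve(overrides=None):
    """Return the names of the add-ons enabled for one cluster."""
    overrides = overrides or {}
    unknown = set(overrides) - {addon.name for addon in CATALOG}
    if unknown:
        raise KeyError(f"unknown add-ons: {sorted(unknown)}")
    enabled = []
    for addon in CATALOG:
        if addon.required:
            # Required add-ons cannot be switched off by a cluster owner.
            if overrides.get(addon.name) is False:
                raise ValueError(f"{addon.name} is required")
            enabled.append(addon.name)
        elif overrides.get(addon.name, addon.enabled_by_default):
            enabled.append(addon.name)
    return enabled
```

For example, resolve({"cert-manager": True, "keda": False}) would add cert-manager and drop keda while leaving every required add-on in place.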

Slide 28

Slide 28 text

AKS Deep Dive: Application deployment process
Terraform Workflow
• Deployment of Namespaces, Managed Identity and Default Network Policy
• Additional Resources
• Review and Approval of Infrastructure changes by CCoE
• Publish values-infra.yaml into ArgoCD state repo
Developer Workflow
• Application Release trigger with tag
• Build Server builds the code and publishes the image
• Encrypt Configuration
• Publish values-app.yaml and application-definition.yaml into ArgoCD state repository
ArgoCD Workflow
• ArgoCD monitors state repository
• ArgoCD initiates deployment of current state

Slide 29

Slide 29 text

A journey to Infrastructure as Code and AKS
Chapter 5: Conclusions. Lessons Learnt and Future Plans

Slide 30

Slide 30 text

Conclusions

Slide 31

Slide 31 text

Conclusions: Lessons Learnt
• No solution fits them all
• Standards are efficient but hard to maintain
• Vicious cycle: Ops vs Engineering
• Do not underestimate the amount of time you will spend on development
• GitHub Copilot is very helpful, but don’t trust it
• Do not blindly follow the recommendations

Slide 32

Slide 32 text

Conclusions: Future Plans
• Automatic Terraform Module updates
• Automatic testing with terraform test
• Designing solutions
• AKS enhancements
• Have the whole infrastructure landscape coded

Slide 33

Slide 33 text

No content

Slide 34

Slide 34 text

We enable people to lead a self-determined life.