
2019 Softonic Development Flow

How we develop at Softonic and move workloads to production.

We talk about our CI/CD pipeline with use cases, tools, secret management, and testing environments.

Some figures: number of deploys, average time per deploy, etc.

Softonic

May 29, 2019
Transcript

  2. Who are we? • A 20-year-old Internet property • Mainly a download portal • Translated into many languages: EN, DE, ES, IT... => 20! (including less common languages like Vietnamese) • 4M daily visits, 12M daily page views • 10K docs/s written to logs at peak time across our services
  3. Who are we? Basilio Vera (@basi) • Senior Principal Software Engineer • 15 years working at Softonic • Master of none
  4. Softonic Development Flow A. Production infrastructure B. Development Flow &

    Tools C. CI/CD explained with examples D. Some numbers
  8. Softonic Production Infrastructure • We came from a monolithic application deployed in a bare-metal datacenter • We have migrated "everything" to SOA and all our workloads run in a cloud provider: Google Cloud • Google Kubernetes Engine (GKE) running in different regions and zones • We use the Elastic Stack for log processing (and as a runtime database!) • Prometheus+Grafana for monitoring
  9. Legacy Architecture (architecture diagram; labels: Binary Providers, N Servers, N Servers, Users EN, Users ES, DB EN Master, DB EN Replica, DB EN N Replicas, SDC CMS, SADS, DB ES Master, DB ES Replica, DB ES N Replicas, …, …, Users BCN, Download API PHP)
  10. Cloud Architecture - Main datacenter (architecture diagram; labels: Binary Providers, Users, CMS, Noodle Web, Download API PHP, Affiliation CMS, Users Data, Comments Data, Apisoba Hapi, User Rating API PHP, CloudSQL, Developer API PHP, CloudSQL, MAGNET, CloudSQL, Experiments API Hapi, SDH CMS, Autocat API, Internal Users, External Users, Internal Users, Catalog API PHP, CloudSQL, Categories API Hapi, Noodle, Setup API PHP, CloudSQL, Affiliation API PHP, CloudSQL)
  11. Cloud Architecture - Other datacenters (architecture diagram; labels: Binary Providers, Users, Download API PHP, Users Data, Comments Data, Apisoba Hapi, Affiliation API PHP, CloudSQL, Noodle Web)
  12. Cloud Multidatacenter (user traffic flow diagram; labels: Users, USA users, JP users, FR users, Google Load Balancer, Main Datacenter Europe, Datacenter Asia Southeast, Datacenter USA West)
  17. • SOA architecture allows us to have smaller independent pieces • Each project uses its own source code repository based on Git • Central Git repository in Bitbucket • Using a GitHub-flow-like application development cycle • Continuous Integration: each code push launches automatic tests (& a staging env) • Continuous Deployment: master goes to production "ALWAYS" Softonic Development flow
  22. Each project has its definition in a common language: • Docker Compose YAML files for development ◦ docker-compose up -d • It should be enough to start developing! • Helm charts for production • ChartMuseum for some Helm dependencies • Jenkins pipeline definition for CI/CD • Infrastructure as Code using Terraform • Secrets encrypted in the repository (SOPS + Helm secrets plugin) Softonic Development flow
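As an illustration of this setup, a minimal sketch of the secrets workflow, assuming a hypothetical my-api chart and a KMS key already configured in .sops.yaml (file names per project may differ):

    # Bring up the local development environment
    docker-compose up -d
    # Encrypt the plain secrets file in place using the key defined in .sops.yaml
    sops --encrypt --in-place helm/my-api/secrets.values.production.yaml
    # helm-secrets decrypts secrets.* files transparently at install/upgrade time
    helm secrets upgrade --install my-api helm/my-api \
      -f helm/my-api/values.production.yaml \
      -f helm/my-api/secrets.values.production.yaml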
  23. Softonic Development flow • Typical repo structure:
    $ ls -la
    640  Apr 12 11:34 .
    4992 Apr 18 12:30 ..
    136  Mar 15 12:51 .dockerignore
    544  Apr 23 12:52 .git
    129  Mar 26 11:10 .gitignore
    120  Mar 15 12:51 .sops.yaml                   # SOPS secrets file definition (keyring to use)
    7    Mar 26 11:10 BASE_VERSION                 # Project version file, matches repo version & docker image tags
    2302 Apr 5  11:05 Jenkinsfile                  # CI/CD pipeline definition
    160  Apr 12 11:34 bin                          # Some useful commands (for ex: generate a valid dev token)
    288  Mar 29 09:49 build                        # Output of the build related stuff (PMD, code metrics, etc.)
    192  Mar 15 12:46 docker                       # Docker images related to the project
    5338 Apr 10 15:15 docker-compose.override.yml
    3901 Apr 12 11:34 docker-compose.yml
    128  Mar 15 14:55 docs                         # Markdown documentation files, integrated in project docs
    96   Mar 15 12:51 helm                         # Project charts and requirements
    832  Apr 16 17:55 laravel                      # Application source code
    1381 Mar 15 12:51 mkdocs.yml                   # Docs definition file
    256  Mar 26 11:10 terraform                    # Terraform manifests
  24. Softonic CI/CD Pipeline Automatic job creation via a Bitbucket plugin. Each project has its own CI/CD pipeline; these pipelines are essentially: • The same in many cases • Slightly different in some • Specific in rare cases We tried to abstract many behaviours into a Jenkins shared library. But before this...
  25. Softonic CI/CD Pipeline Typical steps: • Initialize • Launch tests • Get new release version number (SEMVER or whatever) • Build/push images • Notify build status to project repo & Slack • Deploy on each target • Notify about the deployment to Slack The high-level steps are basically these, but the implementation can differ between services of a different nature Example of a Jenkinsfile that follows these steps
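As a rough, hypothetical sketch of the version and image steps above (the real logic lives in the shared library and is not shown here), a release tag could be derived from the BASE_VERSION file and the Jenkins build number:

    # Hypothetical versioning scheme: BASE_VERSION + CI build number
    VERSION="$(cat BASE_VERSION).${BUILD_NUMBER}"
    docker-compose -f docker-compose.yml build
    docker login -u="$REGISTRY_AUTH_USR" -p="$REGISTRY_AUTH_PSW" your.registry.com
    docker-compose -f docker-compose.yml push      # assumes the compose file tags images with ${VERSION}
    git tag "v${VERSION}" && git push origin "v${VERSION}"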
  26. Softonic CI/CD Pipeline 3 different examples, representing different use cases: A. Single Helm chart: an API, with docs and mysql-backup dependencies B. 2 Helm charts, with 2 releases: frontend application, with statics C. Operator: Magnet+Projection, introduced with the needed steps and initial Helm chart commands.
  27. A. Softonic API typical deployment use case We have many different REST/GraphQL APIs Most of them have the same deployment process, which consists of: • Ensure data dependencies/infrastructure exist: created via Terraform • Deploy the API via Helm chart ◦ API via the main chart ◦ Subchart: cronjob that does automatic backups of MySQL data, or persistent volumes ◦ Subchart: documentation via a Helm subchart Pipeline abstracted into a Jenkins shared library
  29. A. Softonic Jenkins Shared Library • You just need to define some callbacks in your pipeline and some specific settings @Library('softonic-library@4') _ def pushCallback = { stage("Push images") { sh "docker login -u=$REGISTRY_AUTH_USR -p=$REGISTRY_AUTH_PSW your.registry.com" sh "docker-compose -f docker-compose.yml push" } } def preDeployCallback = { stage("Set helm values") { env.HELM_VALUES = "laravel.web.phpFpm.image.tag=${env.VERSION}" } } deploy([ ... test: [ type: "laravel-base", settings: [ targetContainer: 'php-fpm', expectedReadyContainers: '6', codeDir: "app", containerProjectRoot: "/opt/app", slackChannel: "#jenkins-build-results", ], ], deploy: [ clusters: ["master", "replicas"], chartName: "developer-api", environments: ["staging", "production"], notifications: [slackChannel: "#team-platform-notif"], project: [ icon: ":developer:", name: "developer-api", releaseName: "developer-api", ], sites: ["sft", "xxx"], ], ] )
  32. A. Softonic Jenkins Shared Library • The shared library will call a Terraform utility named Terraformer ◦ It's a Helm-like application but for Terraform manifests ◦ It allows templating, useful when workspaces are not enough ◦ It searches for "values" and "secrets" files with predefined naming ◦ It uses sops with GCP KMS to manage the secrets, which are encrypted in the source repo • We abstract job parameters (and "labels") to provide a common interface that allows initial provisioning. Useful for: ◦ New clusters ◦ Disaster recovery Terraformer repo: OPEN SOURCE https://github.com/softonic/terraformer On each JOB: contexts: ["master", "replicas"] terraformer -f ${CONF} -s .${environment}.enc init &&\ (terraformer -f ${CONF} -s .${environment}.enc workspace select ${environment} ||\ terraformer -f ${CONF} -s .${environment}.enc workspace new ${environment}) &&\ terraformer -f ${CONF} -s .${environment}.enc validate &&\ terraformer -f ${CONF} -s .${environment}.enc apply --auto-approve @Library('softonic-library@experimental-pipelines') _ def regions = ['eu-west', 'us-west', 'asia-southeast'] pipeline { ... parameters { string(name: 'CLUSTER', defaultValue: 'all', description: 'What k8s cluster to deploy to') } stages { ... stage('Deploy services') { steps { script { def jobs = getServicesJobs("replica") jobs.each { job -> build job: job.fullName, parameters: [ [$class: 'StringParameterValue', name: 'REGION', value: regions["${params.CLUSTER}"]], ], wait: false } } } } } }
  37. A. Softonic API typical deployment use case We have many different REST/GraphQL APIs Most of them have the same deployment process, which consists of: • Ensure data dependencies exist: created via Terraform OK • Deploy the API via Helm chart ◦ API via the main chart ◦ Subchart: cronjob that does automatic backups of MySQL data, or persistent volumes ◦ Subchart: documentation via a Helm subchart dependencies: - name: docs version: 2.0.2 repository: "@softonic" - name: k8s-snapshot-cronjob version: 0.1.5 repository: "@softonic" - name: mysql-backups version: 1.1.0 repository: "@softonic" docs: enabled: true ingress: subdomain: developer-api image: repository: my.registry.com/developer-api/docs pullSecretName: registry-creds mysqlBackups: enabled: true mysql: database: developer_api host: X.X.X.X secret: name: credentials s3: bucket: mysqldump-sft-developer-api secret: name: mysqldump-aws-credentials Helm deployment command explained later
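The subcharts listed above are served from the internal ChartMuseum; a hedged example of pulling them before a deploy (the repository URL is hypothetical):

    helm repo add softonic https://chartmuseum.internal.example   # hypothetical ChartMuseum URL
    helm dependency update helm/developer-api                     # fetches docs, k8s-snapshot-cronjob and mysql-backups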
  38. B. Softonic Noodle deployment use case Noodle is our frontend service, the one that renders our webpage. Node.js based, uses the Hapi framework Uses Grunt & webpack to generate static assets for JS/CSS/etc. files When you modify a file, a new bundle is generated using a hash to identify it uniquely; this allows you to cache the static files and only generate new files when the content changes between releases
  39. • The deployment process here is a bit different than usual. • It's a bit more complicated because of how deploys are done in a system like Kubernetes: Rolling Update • It is automatically deployed to production when you merge your changes to the master branch Warning Always merge your changes via a PULL REQUEST in the Bitbucket project! It's based on a Jenkinsfile and it deploys your changes automatically. B. Softonic Noodle deployment use case
  40. As this project is deployed in Kubernetes and uses a rolling update strategy, there's a high probability of provoking 404s on static files while the update is running. B. Softonic Noodle deployment use case
  41. • Imagine that the new version contains some new static files (CSS/JS/etc) • These statics are inside the new server (noodle) image • When a new Pod is running it can receive requests and the HTML will use the new static resources • Once the HTML is loaded, the browser will request these new static resources from our CDN (Fastly), which goes to origin • Origin is the service in front of the server (noodle) pods • If the service decides to route the request to one of the pods that are still running the old version (remember we are in the middle of an update) the new static resource request will return a 404 error • The 404 error is then served to the browser, which is not able to load the failed CSS/JS/etc file • Only when the deploy has finished completely can we be sure no 404 errors will be produced because of this To fix this problem we have decided to do a 2-step deploy of each new Noodle release B. Softonic Noodle deployment use case
  42. Now we have 2 different Helm charts: 1. Noodle statics • Contains all the CSS/JS/IMG/fonts/etc static files in the image • They are copied at BUILD time from the latest server and noodle-statics images • It's additive, each new version contains the earlier statics • TODO: We should decide how to automatically clean up very old statics that are not useful anymore 2. Noodle server • Contains the application logic • During build time all the static content is available under a well-known directory • This directory is used by the noodle-statics image to obtain the static content B. Softonic Noodle deployment use case FROM softonic/nginx-vts:1.15.8-1 COPY ./rootfs / COPY --from=my.registry.com/noodle/statics:latest /www/data /www/data COPY --from=my.registry.com/noodle/server:latest /usr/src/app/public /www/data
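A hedged sketch of how this additive build could be driven (image names come from the Dockerfile above; the Dockerfile paths and build order are assumptions):

    # Build and push the new server image first, so its /usr/src/app/public holds the fresh bundles
    docker build -t my.registry.com/noodle/server:${VERSION} -f docker/server/Dockerfile .
    docker push my.registry.com/noodle/server:${VERSION}
    docker tag my.registry.com/noodle/server:${VERSION} my.registry.com/noodle/server:latest
    docker push my.registry.com/noodle/server:latest
    # The statics image then COPYs from statics:latest and server:latest, keeping older assets around
    docker build -t my.registry.com/noodle/statics:${VERSION} -f docker/statics/Dockerfile .
    docker push my.registry.com/noodle/statics:${VERSION}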
  46. This shows communication via MCI for noodle and regional ingress for statics (Fastly origin). We need to be sure that the statics chart deploy is finished before starting the server deploy. This way we can be sure that the new statics are ready to be served in any case. As Fastly uses the eu-west cluster's statics deployment, you need to be sure that this region is deployed before any other! In case of failure of the kube-eu-west cluster you'll need to change the ORIGIN.softonic.com DNS to the IP of the same ingress in another cluster; this affects the deployment order! The cluster with the active statics ingress always needs to be deployed first B. Architecture
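For that failover, the statics ingress IP in another cluster can be looked up with kubectl; a minimal sketch, assuming a kube-us-west context, a noodle-v1 namespace and an ingress named noodle-statics:

    kubectl --context kube-us-west --namespace noodle-v1 get ingress noodle-statics \
      -o jsonpath='{.status.loadBalancer.ingress[0].ip}'
    # Point ORIGIN.softonic.com at the returned IP, then deploy that cluster first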
  49. 1. There are 2 different projects: noodle and statics 2. There are 2 different deploys per cluster: deployment and staticsDeployment 3. The staticsDeployment needs to be deployed first! And we need to be sure it's finished before starting the noodle deploy @Library('[email protected]') _ def project = [ name: "noodle", version: "unknown", icon: ":ramen:", deployUrl: "https://en.softonic.com" ] def staticsProject = [ name: "statics", version: "unknown", icon: ":ramen:", deployUrl: "https://en.softonic.com" ] ... // Define the deployment values stage("Prepare Deployment") { ... steps { script { ... deployment = [ namespace: "${env.DEPLOY_NAMESPACE}", release: "${project['name']}", chartName: "${project['name']}", values: "image.server.tag=${env.DEPLOY_VERSION}", wait: false, debug: true ] staticsDeployment = [ namespace: "${env.DEPLOY_NAMESPACE}", release: "noodle-${staticsProject['name']}", chartName: "${staticsProject['name']}", values: "image.tag=${env.DEPLOY_VERSION}", wait: true, debug: true ] ... } } } // On each Deployment step: stage("Deploy to production ASIA-SOUTHEAST") { ... steps { script { context = contexts[env.ACTIVE_CONTEXT] deploy this, context, staticsDeployment deploy this, context, deployment } } } B. Jenkinsfile
  50. B. Jenkinsfile: Deployment command When the pipeline is executed you can see something like this:
    helm secrets upgrade --install \
      --namespace noodle-v1 \
      noodle-statics \
      helm/statics \
      --set image.tag=1.1985.0 \
      -f helm/statics/values.production.cluster-eu-west.yaml \
      -f helm/statics/secrets.values.production.yaml --wait
    helm secrets upgrade --install \
      --namespace noodle-v1 \
      noodle \
      helm/noodle \
      --set image.server.tag=1.1985.0 \
      -f helm/noodle/values.production.yaml \
      -f helm/noodle/values.production.cluster-eu-west.yaml \
      -f helm/noodle/secrets.values.production.yaml
    If you have permission to decrypt the project secrets you could execute these commands from your shell in case of an emergency. Pay attention to the --wait flag for statics: it waits until the deploy is finished before executing the next command
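Besides --wait, the rollout can also be checked by hand; a minimal sketch, assuming the statics chart creates a Deployment named noodle-statics:

    kubectl --namespace noodle-v1 rollout status deployment/noodle-statics
    kubectl --namespace noodle-v1 get pods   # all Pods should be on the new version before deploying noodle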
  51. The system searches for files matching some patterns to automatically use them as value files. In this example: • helm/noodle/values.yaml: Because Helm always loads this file • helm/noodle/values.production.yaml: Because the environment is production • helm/noodle/values.production.cluster-eu-west.yaml: Because the cluster to deploy to is eu-west • helm/noodle/secrets.values.production.yaml: The same way as plain values files; in this case we don't need to override any secret value for a specific cluster, so the equivalent file for cluster-eu-west does not exist • Values set via the --set flag: image.server.tag=1.1985.0 It merges each YAML file in this order, allowing you to redefine each value in later files. B. Jenkinsfile: Deployment command
  52. • We are testing a different way of deploying this application, using BLUE/GREEN deployment instead of Rolling Update. • It could allow us to avoid the 2-chart based deploy and save the "static" image but… as we are using a CDN that's not simple. • And the blue/green deployment is based on an alpha feature provided by Argo Rollouts B. Bonus
  53. • There are some cases where there's a complex set of dependencies to deploy, maintain and run an application consistently • We have an application that needs all these steps to be deployed C. Operator use case: Magnet+Projection
  55. C. Operator use case: Magnet+Projection Define this as an ENV var to ensure the commands are launched against the right projection: version=23 1 - Ensure the environment variable GENERATE_PROJECTION=0 for appProjection and appProjectionProcess NOTE: you need to keep it at 1 for appProjectionDump. 2 - Start processing messages by scaling the appProjection consumer to 1: kubectl --namespace apisoba-prj-${version} scale --replicas=1 statefulset.apps/apisoba-prj-${version}-projection-app-projection 3 - Check that all the messages have been processed and the queues are empty: rabbit_user=nuokzqif rabbit_password=XXXXXXXX rabbit_host=<server endpoint> APISOBA_EVENTS_MSG=`rabbitmqadmin --username=$rabbit_user --password=$rabbit_password --host=$rabbit_host --ssl --port=443 --vhost=apisoba-${version} -f raw_json -d 3 list -S name queues name messages | jq -c '.[] | select(.name|test("apisoba-events."))' | jq -c .` while read i ; do MSG_COUNT=`echo $i | jq '.messages'` if [[ "$MSG_COUNT" -ne "0" ]]; then QUEUE=`echo $i | jq '.name'` echo "$QUEUE still has messages: $MSG_COUNT" false fi done <<< $APISOBA_EVENTS_MSG 4 - Once all the queues are empty, you need to scale down the appProjection to stop processing new incoming messages: kubectl --namespace apisoba-prj-${version} scale --replicas=0 statefulset.apps/apisoba-prj-${version}-projection-app-projection 5 - Now it's time to DUMP the REDIS view; for this you need to scale the "dump" process to one: dump_replicas=1 kubectl --namespace apisoba-prj-${version} scale --replicas=${dump_replicas} deployment.apps/apisoba-prj-${version}-projection-app-projection-dump 6 - But this is still not enough, this would process the programs one by one; before doing it you need to know what these programs are, which is done by executing the following commands: dump_pod=$(kubectl --namespace apisoba-prj-${version} get pods -l component=app-projection-dump -o=custom-columns=NAME:.metadata.name | grep -v NAME | head -n1) kubectl --namespace apisoba-prj-${version} exec -it "${dump_pod}" /docker-php-entrypoint php ./bin/refreshApplicationProjection.php ./config/container.php
  56. Now the process should be running; once all the queues are empty (it could take some time) you can disable the "BULK MODE" and go to "LIVE MODE". 1 - Ensure the message rate for the apisoba-events-* queues is low. rabbit_user=nuokzqif rabbit_password=XXXXXXXX rabbit_host=<server endpoint> rabbitmqadmin --username=$rabbit_user --password=$rabbit_password --host=$rabbit_host --ssl --port=443 --vhost=apisoba-${version} -f raw_json -d 3 list -S name queues name messages | jq -c '.[] | select(.name|test("apisoba-events."))' | jq -c . 2 - Scale the projector process down to 0, its work has finished! kubectl --namespace apisoba-prj-${version} scale --replicas=0 deployment.apps/apisoba-prj-projection-${version}-app-projection-dump 3 - Reduce projection resources, live mode does not need as many resources rabbit_user=nuokzqif rabbit_password=XXXXXXXX rabbit_host=<server endpoint> kubectl --namespace apisoba-prj-${version} scale --replicas=1 statefulset.apps/apisoba-prj-${version}-projection-app-projection-process for i in {1..99}; do \ kubectl --namespace apisoba-prj-${version} delete pod/apisoba-prj-${version}-projection-app-projection-process-$i; \ rabbitmqadmin --host=$rabbit_host --port=443 --ssl \ --username=$rabbit_user --password=$rabbit_password --vhost=apisoba-${version} delete queue name=apisoba-events-$i; \ done # CHANGE CONTEXT TO regional cluster! -> kctx EUROPE, USA or ASIA kctx EUROPE region=eu-west rabbit_host=<server endpoint> kubectl --namespace apisoba-prj-v${version} scale --replicas=1 statefulset.apps/apisoba-prj-persistence-v${version}-projection-pers-upserte for i in {1..9}; do \ kubectl --namespace apisoba-prj-v${version} delete pod/apisoba-prj-persistence-v${version}-projection-pers-upserte-$i; \ rabbitmqadmin --host=$rabbit_host --port=443 --ssl \ --username=$rabbit_user --password=$rabbit_password --vhost=apisoba-${version} delete queue name=apisoba-cmd-$region-$i; \ done C. Operator use case: Magnet+Projection
  57. # CHANGE CONTEXT TO regional cluster! -> kctx EUROPE, USA or ASIA kctx USA region=us-west rabbit_host=<server endpoint> kubectl --namespace apisoba-prj-v${version} scale --replicas=1 statefulset.apps/apisoba-prj-persistence-v${version}-projection-pers-upserte for i in {1..9}; do \ kubectl --namespace apisoba-prj-v${version} delete pod/apisoba-prj-persistence-v${version}-projection-pers-upserte-$i; \ rabbitmqadmin --host=$rabbit_host --port=443 --ssl \ --username=$rabbit_user --password=$rabbit_password --vhost=apisoba-${version} delete queue name=apisoba-cmd-$region-$i; \ done # CHANGE CONTEXT TO regional cluster! -> kctx EUROPE, USA or ASIA kctx ASIA region=asia-southeast rabbit_host=<server endpoint> kubectl --namespace apisoba-prj-v${version} scale --replicas=1 statefulset.apps/apisoba-prj-persistence-v${version}-projection-pers-upserte for i in {1..9}; do \ kubectl --namespace apisoba-prj-v${version} delete pod/apisoba-prj-persistence-v${version}-projection-pers-upserte-$i; \ rabbitmqadmin --host=$rabbit_host --port=443 --ssl \ --username=$rabbit_user --password=$rabbit_password --vhost=apisoba-${version} delete queue name=apisoba-cmd-$region-$i; \ done # Go back to EUROPE to continue the process kctx EUROPE 4 - Set GENERATE_PROJECTION to 1 for appProjection and appProjectionProcess kubectl edit statefulset.apps/apisoba-prj-v${version}-projection-app-projection ... kubectl edit statefulset.apps/apisoba-prj-v${version}-projection-app-projection-process ... 5 - Scale up appProjection to move the processing to LIVE: kubectl --namespace apisoba-prj-v${version} scale --replicas=1 statefulset.apps/apisoba-prj-v${version}-projection-app-projection C. Operator use case: Magnet+Projection
  58. • Very complex deploy • Time intensive • Prone to

    human error C. Operator use case: Magnet+Projection
  61. • An Operator is a method of packaging, deploying and

    managing a K8s app • It’s deployed and managed using the Kubernetes APIs C. Operator use case: Magnet+Projection
  62. • Framework used: Kubebuilder • Custom Resource of the Kubernetes

    API: projections.projection.k8s.io C. Operator use case: Magnet+Projection apiVersion: projection.k8s.io/v1beta1 kind: Projection metadata: name: sft-apisoba namespace: sft-apisoba-projection-v28 spec: buses: internal: adminPassword: secretName: credentials adminUser: nuokzqif ... persistence: adminPassword: secretName: credentials eventsDb: name: event_store password: secretName: credentials stream: sft_events-v7 user: catalog_etl image: pullPolicy: Always pullSecret: projection-secret repository: my.registry.com/apisoba-prj/php-cli tag: 1.306.0 ... ... logLevel: info mode: live modes: bulk: parallelism: dump: 50 projection: 100 live: parallelism: projection: 1 redis: host: tcp://sft-apisoba-projection-v28-redis-master version: 28 status: state: live
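Once the controller is running, operating the application comes down to managing this single resource; for example (the manifest file name is hypothetical):

    kubectl apply -f projection.yaml                                # create or update the Projection
    kubectl --namespace sft-apisoba-projection-v28 get projections  # plural name from projections.projection.k8s.io
    kubectl --namespace sft-apisoba-projection-v28 describe projection sft-apisoba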
  63. • Resources allow you to store and retrieve structured data • With a Custom Controller you can provide a declarative API • The Operator pattern combines custom resources with custom controllers C. Operator use case: Magnet+Projection
  65. • Projection contains several StatefulSets and Deployments • We watch

    them! C. Operator use case: Magnet+Projection ... // Watch for changes to Projection err = c.Watch(&source.Kind{Type: &v1beta1.Projection{}}, &handler.EnqueueRequestForObject{}) ... // Watching resources err = c.Watch(&source.Kind{Type: &appsv1.StatefulSet{}}, &handler.EnqueueRequestForOwner{ IsController: true, OwnerType: &v1beta1.Projection{}, }) ... err = c.Watch(&source.Kind{Type: &appsv1.Deployment{}}, &handler.EnqueueRequestForOwner{ IsController: true, OwnerType: &v1beta1.Projection{}, }) ...
  67. • We deploy it with a simple Helm command C.

    Operator use case: Magnet+Projection helm secrets upgrade --install \ --namespace projection-controller-system \ projection-controller \ helm/projection-controller \ --set image.tag=0.8.0 \ -f helm/projection-controller/values.yaml
  69. Number of Deployments $ helm history noodle --max 20
    1156 Mon May 27 10:34:44 2019 SUPERSEDED noodle-1.2235.0 Upgrade complete
    1157 Mon May 27 11:18:09 2019 SUPERSEDED noodle-1.2236.0 Upgrade complete
    1158 Mon May 27 17:35:45 2019 SUPERSEDED noodle-1.2237.0 Upgrade complete
    1159 Mon May 27 17:50:20 2019 SUPERSEDED noodle-1.2238.0 Upgrade complete
    1160 Mon May 27 17:58:28 2019 SUPERSEDED noodle-1.2239.0 Upgrade complete
    1161 Mon May 27 18:02:29 2019 SUPERSEDED noodle-1.2238.0 Rollback to 1159
    1162 Mon May 27 18:36:44 2019 SUPERSEDED noodle-1.2240.0 Upgrade complete
    1163 Mon May 27 18:39:46 2019 SUPERSEDED noodle-1.2238.0 Rollback to 1159
    1164 Mon May 27 18:49:34 2019 SUPERSEDED noodle-1.2241.0 Upgrade complete
    1165 Tue May 28 09:51:07 2019 SUPERSEDED noodle-1.2242.0 Upgrade complete
    1166 Tue May 28 10:31:34 2019 SUPERSEDED noodle-1.2243.0 Upgrade complete
    1167 Tue May 28 10:50:00 2019 SUPERSEDED noodle-1.2244.0 Upgrade complete
    1168 Tue May 28 11:37:25 2019 SUPERSEDED noodle-1.2245.0 Upgrade complete
    1169 Tue May 28 13:01:25 2019 SUPERSEDED noodle-1.2246.0 Upgrade complete
    1170 Tue May 28 13:21:34 2019 SUPERSEDED noodle-1.2247.0 Upgrade complete
    1171 Tue May 28 13:52:59 2019 SUPERSEDED noodle-1.2248.0 Upgrade complete
    1172 Tue May 28 14:53:17 2019 SUPERSEDED noodle-1.2249.0 Upgrade complete
    1173 Tue May 28 15:12:18 2019 SUPERSEDED noodle-1.2250.0 Upgrade complete
    1174 Tue May 28 17:07:29 2019 SUPERSEDED noodle-1.2251.0 Upgrade complete
    1175 Tue May 28 18:49:07 2019 DEPLOYED noodle-1.2252.0 Upgrade complete
    The Noodle project is deployed to production 10 times/day on each production cluster: this project alone accounts for 30 production deployments per day
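The "Rollback to 1159" entries in this history are themselves one-liners; a minimal sketch (Helm 2 syntax, as used at the time):

    helm rollback noodle 1159    # roll the release back to revision 1159
    helm history noodle --max 5  # confirm a new revision was recorded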
  70. Number of Deployments $ helm list NAME REVISION UPDATED STATUS

    CHART APP VERSION NAMESPACE acquired-traffic-proxy-v1 3 Mon Dec 3 16:09:39 2018 DEPLOYED acquired-traffic-proxy-1.0.0 acquired-traffic-proxy-v1 affiliation-api 16 Wed May 8 16:53:02 2019 DEPLOYED affiliation-api-1.0.1 affiliation-api apisoba-v3-release 162 Fri May 17 15:21:27 2019 DEPLOYED apisoba-3.0.0 3.0.0 apisoba-v3 apk-octopus 9 Wed Feb 6 17:03:36 2019 DEPLOYED apk-octopus-1.50.0 1.50.0 apk-octopus-v1 articles-worker 32 Tue Dec 11 12:26:47 2018 DEPLOYED articles-worker-1.1.0 1.6.0 articles-worker atl-internal-linker-api-v1 4 Tue May 14 16:50:49 2019 DEPLOYED internal-linker-api-1.0.0 atl-internal-linker-api-v1 atlas-api-release 185 Tue May 28 09:52:04 2019 DEPLOYED atlas-api-1.155.0 1.155.0 atlas-api atlas-release-v2 441 Wed May 29 13:42:33 2019 DEPLOYED atlas-1.716.0 1.716.0 atlas-v2 atlas-search-api 59 Tue May 28 10:56:05 2019 DEPLOYED atlas-search-api-0.25.0 0.25.0 atlas-search-api categories-api-v1 64 Wed Mar 27 14:17:33 2019 DEPLOYED categories-api-0.0.1 categories-api-v1 ek-apisoba 73 Fri May 3 16:52:22 2019 DEPLOYED ek-apisoba-0.0.6 ek-apisoba entrypoint-ng 71 Wed May 29 10:18:41 2019 DEPLOYED entrypoint-ng-1.70.0 1.70.0 entrypoint-v1 experiments-api-v1 46 Mon Apr 1 13:53:36 2019 DEPLOYED experiments-api-1.0.1 experiments-api-v1 experiments-consumer-v1 19 Tue Apr 2 12:42:56 2019 DEPLOYED experiments-consumer-1.0.0 experiments-consumer-v1 fhp-apisoba-projection-v28 2 Mon May 13 15:22:58 2019 DEPLOYED apisoba-projection-29.0.0 1.306.0 fhp-apisoba-projection-v28 fhp-apisoba-projection-v29 2 Mon May 20 18:18:34 2019 DEPLOYED apisoba-projection-29.0.0 1.306.0 fhp-apisoba-projection-v29 fhp-apisoba-projection-v30 1 Tue May 21 15:51:23 2019 DEPLOYED apisoba-projection-29.0.0 1.306.0 fhp-apisoba-projection-v30 fhp-apisoba-v3-release 23 Fri May 17 15:21:26 2019 DEPLOYED apisoba-3.0.0 3.0.0 fhp-apisoba-v3 forty-two-matters-v1 36 Mon Mar 18 16:26:40 2019 DEPLOYED forty-two-matters-0.0.1 forty-two-matters-v1 global-redirecter 34 Mon May 20 15:17:01 2019 DEPLOYED global-redirecter-0.1.2 1.0 global-redirecter helm-exporter 1 Tue Jan 8 09:53:52 2019 DEPLOYED helm-exporter-0.1.0 1.0 monitoring-v2 internal-linker-api-v1 69 Tue May 14 16:50:48 2019 DEPLOYED internal-linker-api-1.0.0 internal-linker-api-v1 istio-system 5 Wed May 29 11:15:52 2019 DEPLOYED istio-system-1.1.30 1.1.1 istio-system jenkins 53 Tue May 14 14:23:46 2019 DEPLOYED jenkins-0.4.13 2.121.3 jenkins k8s-volume-controller 1 Fri Sep 14 13:32:57 2018 DEPLOYED k8s-volume-controller-0.1.1 1.0 k8s-volume-controller kibana 18 Thu Jan 24 17:29:58 2019 DEPLOYED kibana-0.1.1 1.0 kibana logging 119 Thu May 9 17:32:25 2019 DEPLOYED logging-0.10.5 logging monitoring-v2 48 Fri May 10 12:37:00 2019 DEPLOYED monitoring-v2-0.8.4 monitoring-v2 noodle 1178 Wed May 29 12:52:42 2019 DEPLOYED noodle-1.2255.0 1.2255.0 noodle-v1 noodle-statics 348 Wed May 29 12:52:04 2019 DEPLOYED statics-1.2255.0 1.2255.0 noodle-v1 oauth-server 13 Tue Mar 12 12:53:44 2019 DEPLOYED oauth-server-1.0.1 oauth-server-v2 orchestration-docs 4 Thu Jan 24 16:42:23 2019 DEPLOYED orchestration-docs-1.0.0 6.0.0 orchestration-docs-v1 oscar-v1 87 Mon May 27 14:51:56 2019 DEPLOYED oscar-1.103.0 1.103.0 oscar-v1 projection-controller 1 Fri May 17 18:28:39 2019 DEPLOYED projection-controller-0.1.0 1.0 projection-controller-system rsyslog-tls 18 Wed May 15 13:17:53 2019 DEPLOYED rsyslog-tls-1.2.3 1.0 rsyslog-tls sft-cpi-api-v1 63 Wed May 29 15:32:52 2019 FAILED cpi-api-0.0.1 0.0.1 sft-cpi-api-v1 ... $ helm list | wc -l 124 Total number of deploys in our 3 productive clusters: 9396
  71. Number of Volatile Deployments $ kns | grep volatile volatile-atlas-at-1098

    volatile-atlas-at-1802 volatile-cpi-cms-dsk-000 volatile-xxx-dsk-1461 volatile-xxx-fhp-131 volatile-xxx-fhp-229 volatile-xxx-fhp-77 volatile-noodle-bb-879 volatile-noodle-bb-939 volatile-noodle-bb-956 volatile-noodle-bb-957 volatile-noodle-bb-958 volatile-noodle-bb-966 volatile-noodle-bb-972 volatile-noodle-cat-1010 volatile-noodle-cat-1224 volatile-noodle-cat-861 volatile-noodle-dsk-1157 volatile-noodle-dsk-1386 volatile-noodle-dsk-1413 volatile-noodle-dsk-1446 volatile-noodle-dsk-1457 volatile-noodle-fhp-98 volatile-noodle-soal-978 volatile-noodle-tech-646 volatile-noodle-tech-678 volatile-noodle-tech-684 • 27 running right now • More info about them: ◦ The name is obtained from the GIT branch name ◦ The branch name is generated from the JIRA task name ◦ When the branch is merged/removed the volatile environment is deleted
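As a hedged sketch of how such a volatile namespace could be derived from a branch (the branch name, slug rules and chart values here are assumptions):

    BRANCH="NOODLE-BB-879-fix-header"   # hypothetical branch created from the JIRA task
    NS="volatile-$(echo "$BRANCH" | tr '[:upper:]' '[:lower:]' | cut -d- -f1-3)"   # -> volatile-noodle-bb-879
    kubectl create namespace "$NS"
    helm secrets upgrade --install "$NS" helm/noodle --namespace "$NS" --set image.server.tag="$VERSION"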
  72. Deployment time • It depends heavily on the nature of the application • We have deploys that take just a few seconds, with complete pipeline execution in less than 1 minute
  76. Deployment time • Others are complex to build • npm downloads the Internet • The deploy command (2-step deploy) can take 1 minute each • The deploy is fully finished (all Pods updated) after a few more minutes!