
Chef's Operations Workflow


Slides from my presentation at Boulder DevOps meetup about the new workflow automation that Chef's Operations team has adopted.

Joshua Timberman

October 15, 2018


Transcript

  1. Applications, sample set

     • Hosted Chef (api.chef.io)
     • Downloads (downloads.chef.io)
     • Omnitruck (omnitruck.chef.io)
     • Learn Chef (learn.chef.io)
     • Chef Community Cookbooks (supermarket.chef.io)
     • Internal and External utilities
     • "Gateway" and "glue" services
     • Kubernetes ;)
     • etc...
  2. slack:
       notify_channel: ops-notify
     github:
       delete_branch_on_merge: true
     # These are our Buildkite pipelines where deploys take place
     pipelines:
       - verify:
           description: Pull Request validation tests
       - deploy/acceptance:
           description: Deploy changes to kubernetes.chef.co pods
           definition: .expeditor/deploy.pipeline.yml
           env:
             - ENVIRONMENT: acceptance
             - APP: chef-web-util
       - deploy/production:
           description: Deploy changes to kubernetes.chef.co pods
           definition: .expeditor/deploy.pipeline.yml
           env:
             - ENVIRONMENT: production
             - APP: chef-web-util
     # These actions are taken when `/expeditor promote` is run from Slack
     promote:
       action:
         - bash:.expeditor/promote.sh
         - trigger_pipeline:deploy/production:
             only_if_conditions:
               - value_one: "{{target_channel}}"
                 operator: equals
                 value_two: stable
         - bash:.expeditor/purge-cdn.sh:
             post_commit: true
       channels:
         - acceptance
         - stable
     merge_actions:
       - built_in:trigger_habitat_package_build:
           post_commit: true
           ignore_labels:
             - "Habitat: Skip Build"
             - "Expeditor: Skip All"
     habitat_packages:
       - chef-web-util:
           origin: chefops
           export:
             - docker
     subscriptions:
       - workload: docker_image_published:chefops/chef-web-util:acceptance
         actions:

     .expeditor/config.yml
  3. pipelines:
       - verify:
           description: Pull Request validation tests
       - deploy/acceptance:
           description: Deploy changes to kubernetes
           definition: .expeditor/deploy.pipeline.yml
           env:
             - ENVIRONMENT: acceptance
             - APP: APPNAME
       - deploy/production:
           description: Deploy changes to kubernetes
           definition: .expeditor/deploy.pipeline.yml
           env:
             - ENVIRONMENT: production
             - APP: APPNAME

     .expeditor/config.yml: pipelines
  4. pipelines:
       - verify:
           description: Pull Request validation tests
       - deploy/acceptance:
           description: Deploy changes to kubernetes
           definition: .expeditor/deploy.pipeline.yml
           env:
             - ENVIRONMENT: acceptance
             - APP: APPNAME
       - deploy/production:
           description: Deploy changes to kubernetes
           definition: .expeditor/deploy.pipeline.yml
           env:
             - ENVIRONMENT: production
             - APP: APPNAME

     .expeditor/config.yml: pipelines
     .expeditor/verify.pipeline.yml
  5. pipelines:
       - verify:
           description: Pull Request validation tests
       - deploy/acceptance:
           description: Deploy changes to kubernetes
           definition: .expeditor/deploy.pipeline.yml
           env:
             - ENVIRONMENT: acceptance
             - APP: APPNAME
       - deploy/production:
           description: Deploy changes to kubernetes
           definition: .expeditor/deploy.pipeline.yml
           env:
             - ENVIRONMENT: production
             - APP: APPNAME

     .expeditor/config.yml: pipelines
  6. steps:
       - label: static
         command: |
           static syntax and lint checking commands (shellcheck, rubocop, etc)
         plugins:
           docker#v1.1.1:
             image: "chefes/buildkite"
       - label: unit
         #... unit testing commands, rspec, etc
       - label: audit
         #... for auditing bundles in ruby apps for example

     .expeditor/verify.pipeline.yml
  7. steps:
       - command: .expeditor/buildkite/kubernetes.sh
         label: "Kubernetes"
         concurrency: 1
         concurrency_group: chef-nginx-demo-master/deploy/$ENVIRONMENT
         plugins:
           docker#v1.1.1:
             always-pull: true
             image: "chefes/buildkite"
             environment:
               - CHEF_CD_AWS_ACCESS_KEY_ID
               - CHEF_CD_AWS_SECRET_ACCESS_KEY
               - AWS_SSL_ARN
               - ENVIRONMENT
               - CI
               - APP
               - HAB_AUTH_TOKEN

     deploy.pipeline.yml
  8. chef-web-util: Source Path: /src
     chef-web-util: Installed Path: /hab/pkgs/chefops/chef-web-util/1.0.0/20181010192533
     chef-web-util: Artifact: /src/results/chefops-chef-web-util-1.0.0-20181010192533-x86_64-linux.hart
     chef-web-util: Build Report: /src/results/last_build.env
     chef-web-util: SHA256 Checksum: a11540fb7908d4d1e67c56165b9a4388a0c10858aa729b029cc4aa9bf9770feb
     chef-web-util: Blake2b Checksum: 7a987da5536703ac1301d3c5ed17fe510dabf04d84b02ee052df1a1da7d6c467
     chef-web-util:
     chef-web-util: I love it when a plan.sh comes together.
     chef-web-util:
     chef-web-util: Build time: 1m36s

     hab studio enter > build
  9. Habitat packages:

     • Are timestamp-based artifacts
     • Declare all their dependencies
     • Are "self contained"
     • https://habitat.sh
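Those timestamp-based artifacts are named by a four-part identifier, origin/name/version/release, where the release is the build timestamp. A minimal shell sketch splitting the identifier from the build output above (the variable names are mine):

```shell
# Split a Habitat package identifier (origin/name/version/release).
# The release segment is a build timestamp (YYYYMMDDhhmmss), which is
# what makes every build a unique, immutable artifact.
ident="chefops/chef-web-util/1.0.0/20181010192533"

IFS=/ read -r origin name version release <<EOF
$ident
EOF

echo "origin=$origin name=$name version=$version release=$release"
# prints: origin=chefops name=chef-web-util version=1.0.0 release=20181010192533
```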
  10. Kubernetes is a portable, extensible open-source platform for managing containerized workloads and services, that facilitates both declarative configuration and automation.
  11. steps:
        - command: .expeditor/buildkite/kubernetes.sh
          label: "Kubernetes"
          concurrency: 1
          concurrency_group: chef-nginx-demo-master/deploy/$ENVIRONMENT
          plugins:
            docker#v1.1.1:
              always-pull: true
              image: "chefes/buildkite"
              environment:
                - CHEF_CD_AWS_ACCESS_KEY_ID
                - CHEF_CD_AWS_SECRET_ACCESS_KEY
                - AWS_SSL_ARN
                - ENVIRONMENT
                - CI
                - APP
                - HAB_AUTH_TOKEN

      deploy.pipeline.yml
  12. #!/bin/bash
      set -euo pipefail

      if [[ ! -z ${CI+x} ]]; then
        aws-configure chef-cd
        mkdir -p ~/.kube
        aws --profile chef-cd s3 cp s3://chef-cd-citadel/kubernetes.chef.co.config ~/.kube/config
      else
        echo "WARN: Not running in Buildkite, assuming local manual deployment"
        echo "WARN: This requires that ~/.kube/config exists with the proper content"
      fi

      export ENVIRONMENT=${ENVIRONMENT:-dev}
      export APP=${APP} DEBUG=${DEBUG:-false}

      # This block translates the "environment" into the appropriate Habitat
      # channel from which to deploy the packages
      if [ "$ENVIRONMENT" == "acceptance" ]; then
        export CHANNEL=acceptance
      elif [ "$ENVIRONMENT" == "production" ]; then
        export CHANNEL=stable
      elif [ "$ENVIRONMENT" == "dev" ] || [ "$ENVIRONMENT" == "test" ]; then
        export CHANNEL=unstable
      else
        echo "We do not currently support deploying to $ENVIRONMENT"
        exit 1
      fi

      # We need the HAB_AUTH_TOKEN set (via Buildkite pipeline) for private packages
      get_image_tag() {
        results=$(curl --silent -H "Authorization: Bearer $HAB_AUTH_TOKEN" https://willem.habitat.sh/v1/depot/channels/chefops/${CHANNEL}/pkgs/${APP}/latest | jq '.ident')
        pkg_version=$(echo "$results" | jq -r .version)
        pkg_release=$(echo "$results" | jq -r .release)
        echo "${pkg_version}-${pkg_release}"
      }

      # Retrieves the ELB's public DNS name
      get_elb_hostname() {
        kubectl get services ${APP}-${ENVIRONMENT} --namespace=${APP} -o json 2>/dev/null | \
          jq '.status.loadBalancer.ingress[].hostname' -r
      }

      # The ELB isn't ready until the hostname is set, so wait until it's ready
      wait_for_elb() {
        attempts=0
        max_attempts=10
        elb_host=""
        while [[ $attempts -lt $max_attempts ]]; do
          elb_host=$(get_elb_hostname || echo)
          if [[ ! -n $elb_host ]]; then
            echo "Did not find ELB yet... sleeping 5s"
            attempts=$[$attempts + 1]
            sleep 5
          else
            echo "Found ELB: $elb_host"
            break
          fi
        done
      }

      # Used for debugging on a local workstation
      if [[ $DEBUG == "true" ]]; then
        echo "--- DEBUG: Environment"
        echo "Application: ${APP}"
        echo "Channel: ${CHANNEL}"
        echo "Environment: ${ENVIRONMENT}"
      fi

      echo "--- Applying kubernetes configuration for ${ENVIRONMENT} to cluster"
      IMAGE_TAG=$(get_image_tag)
      erb -T- kubernetes/deployment.yml | kubectl apply -f -

      if [[ `grep -c "^kind: Service$" kubernetes/deployment.yml` -gt 0 ]]; then
        echo "+++ Waiting for Load Balancer..."
        wait_for_elb
      fi

      kubernetes.sh
  13. .expeditor/buildkite/kubernetes.sh

      • Deploy pipeline in Buildkite runs the script
      • Copies credentials used for kubectl
      • Retrieves the Habitat package information from Builder
      • Applies the kubernetes/deployment.yml manifest
      • Waits until the ELB is available (if there is one)
      • Success!
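The heart of the script is the environment-to-channel mapping: acceptance deploys pull packages from the acceptance channel, production pulls from stable, and dev/test pull from unstable. A standalone sketch of that mapping (the function name is mine; the logic mirrors the if/elif chain in kubernetes.sh):

```shell
# Map a deployment environment to the Habitat channel to deploy from,
# mirroring the if/elif chain in kubernetes.sh. Unknown environments fail.
channel_for() {
  case "$1" in
    acceptance) echo "acceptance" ;;
    production) echo "stable" ;;
    dev|test)   echo "unstable" ;;
    *) echo "We do not currently support deploying to $1" >&2; return 1 ;;
  esac
}

channel_for production   # stable
```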
  14. Kubernetes Resources

      • Pods
      • StatefulSets & Deployments
      • Services
      • Namespaces
      • Custom Resource Definitions (CRDs)
  15. ---
      apiVersion: v1
      kind: Namespace
      metadata:
        name: nginx-demo
      ---
      apiVersion: habitat.sh/v1beta1
      kind: Habitat
      metadata:
        name: nginx-demo-<%= ENV['ENVIRONMENT'] %>
        namespace: nginx-demo
        labels:
          app: nginx-demo-<%= ENV['ENVIRONMENT'] %>
      customVersion: v1beta2
      spec:
        v1beta2:
          image: chefops/nginx-demo:<%= ENV['IMAGE_TAG'] %>
          count: 2
          service:
            name: nginx-demo-<%= ENV['ENVIRONMENT'] %>
            topology: standalone
      ---
      apiVersion: v1
      kind: Service
      metadata:
        name: nginx-demo-<%= ENV['ENVIRONMENT'] %>
        namespace: nginx-demo
      spec:
        selector:
          habitat-name: nginx-demo-<%= ENV['ENVIRONMENT'] %>
        ports:
          - name: http
            port: 80
            targetPort: 80
            protocol: TCP
        type: LoadBalancer

      Kubernetes: A Primer via Configuration
  16. ---
      apiVersion: v1
      kind: Namespace
      metadata:
        name: nginx-demo

      kubernetes/deployment.yml - Namespace

      • Namespaces isolate resources in "virtual clusters"
      • Each application gets a namespace
  17. apiVersion: habitat.sh/v1beta1
      kind: Habitat
      metadata:
        name: nginx-demo-<%= ENV['ENVIRONMENT'] %>
        namespace: nginx-demo
        labels:
          app: nginx-demo-<%= ENV['ENVIRONMENT'] %>
      customVersion: v1beta2
      spec:
        v1beta2:
          image: chefops/nginx-demo:<%= ENV['IMAGE_TAG'] %>
          count: 2
          service:
            name: nginx-demo-<%= ENV['ENVIRONMENT'] %>
            topology: standalone

      kubernetes/deployment.yml - Habitat

      • Applications are deployed using the Habitat Operator
      • This creates a StatefulSet with two pods
      • The pods will use the Docker image specified
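The <%= ENV['IMAGE_TAG'] %> interpolated into the image reference is the version-release string that get_image_tag assembles from Builder's API response. A sketch of that composition with sample values (the values are mine, taken from the earlier build output):

```shell
# Compose the Docker image reference the Habitat operator will pull.
# IMAGE_TAG is "<version>-<release>", as built by get_image_tag in
# kubernetes.sh from Builder's /latest response.
pkg_version="1.0.0"           # sample; the script reads this via jq -r .version
pkg_release="20181010192533"  # sample; the script reads this via jq -r .release
IMAGE_TAG="${pkg_version}-${pkg_release}"

image="chefops/nginx-demo:${IMAGE_TAG}"
echo "$image"   # chefops/nginx-demo:1.0.0-20181010192533
```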
  18. apiVersion: v1
      kind: Service
      metadata:
        name: nginx-demo-<%= ENV['ENVIRONMENT'] %>
        namespace: nginx-demo
      spec:
        selector:
          habitat-name: nginx-demo-<%= ENV['ENVIRONMENT'] %>
        ports:
          - name: http
            port: 80
            targetPort: 80
            protocol: TCP
        type: LoadBalancer

      kubernetes/deployment.yml - Service
  19. % kubectl get all -n nginx-demo
      NAME                         READY   STATUS    RESTARTS   AGE
      pod/nginx-demo-acceptance-0  1/1     Running   0          4m
      pod/nginx-demo-acceptance-1  1/1     Running   0          4m

      NAME                            TYPE           CLUSTER-IP       EXTERNAL-IP        PORT(S)        AGE
      service/nginx-demo-acceptance   LoadBalancer   100.64.186.238   ac1532ae98ed5...   80:31027/TCP   4m

      NAME                                     DESIRED   CURRENT   AGE
      statefulset.apps/nginx-demo-acceptance   2         2         4m

      Kubernetes resources are created
  20. apiVersion: kops/v1alpha2
      kind: InstanceGroup
      metadata:
        labels:
          kops.k8s.io/cluster: kubernetes.chef-internal
        name: nodes
      spec:
        additionalUserData:
          - content: |
              #!/bin/sh - installs, configures, and runs chef
            name: chefbootstrap.sh
            type: text/x-shellscript
        cloudLabels:
          X-Application: kubernetes
          X-Contact: jtimberman
          X-Dept: Operations
          X-Environment: production
        image: kope.io/k8s-1.8-debian-jessie-amd64-hvm-ebs-2018-02-08
        machineType: m4.xlarge
        maxSize: 6
        minSize: 4
        nodeLabels:
          kops.k8s.io/instancegroup: nodes
        role: Node
        subnets:
          - us-west-2a

      Kops nodes and masters configuration
  21. additionalUserData:
        - content: |
            #!/bin/sh
            apt-get -y update
            apt-get -y install python-pip
            pip install awscli
            mkdir -p /etc/chef
            /usr/local/bin/aws --region us-east-1 s3 cp s3://citadel-bucket/validation.pem /etc/chef/validation.pem
            INSTANCE_ID=$(curl -s 169.254.169.254/1.0/meta-data/instance-id)
            cat >/etc/chef/client.rb <<EOF
            node_name "$INSTANCE_ID"
            log_location "/var/log/chef/client.log"
            chef_server_url "https://api.chef.io/organizations/chef-utility"
            validation_client_name "chef-utility-validator"
            verify_api_cert true
            use_policyfile true
            policy_group "prod"
            policy_name "k8s-node"
            EOF
            curl -L https://omnitruck.chef.io/install.sh | sudo bash -s -- -v 14.4.56
            mkdir /var/log/chef
            chef-client
          name: chefbootstrap.sh
          type: text/x-shellscript

      Additional User Data - chef bootstrap script
  22. cat >/etc/chef/client.rb <<EOF
      node_name "$INSTANCE_ID"
      log_location "/var/log/chef/client.log"
      chef_server_url "https://api.chef.io/organizations/chef-utility"
      validation_client_name "chef-utility-validator"
      verify_api_cert true
      use_policyfile true
      policy_group "prod"
      policy_name "k8s-node"
      EOF
      curl -L https://omnitruck.chef.io/install.sh | sudo bash -s
      chef-client

      Kops - chefbootstrap.sh
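The unquoted EOF delimiter is what makes this work: the shell expands $INSTANCE_ID when the file is written at boot, so each node registers under its own instance id. A minimal sketch of the same pattern (the temp file and sample instance id are mine):

```shell
# Same heredoc pattern as chefbootstrap.sh: an unquoted EOF delimiter
# lets $INSTANCE_ID expand when the config file is generated.
INSTANCE_ID="i-0123456789abcdef0"  # sample; the real script queries EC2 metadata
conf="$(mktemp)"

cat > "$conf" <<EOF
node_name "$INSTANCE_ID"
policy_group "prod"
policy_name "k8s-node"
EOF

head -n 1 "$conf"   # node_name "i-0123456789abcdef0"
```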
  23. k8s-node Chef Policyfile recipes

      • User management (LDAP, SSH Keys)
      • Centralized Logging
      • Send InSpec profile data to Chef Automate
      • Keep Chef package updated
      • Ensure Chef is running on a schedule
  24. Secrets Management

      • SSL certificates
      • API tokens
      • SSH keys
      • Configuration files
      • Repositories
      • AWS Accounts
      • Kubernetes
  25. Procedure

      1. Make changes to the Repository, open a pull request
      2. Wait for Buildkite verification, peer approval, then merge
      3. Wait for Expeditor to complete merge actions
      4. Verify acceptance with stakeholders
      5. Promote to Production
  26. Previous Procedure

      • Let's just say...
      • It was more than 5 steps
      • And most of them were manual
      • Unfortunate, really
      • For an Automation Company
  27. Make changes to the repository, open a pull request

      git checkout -b $USERNAME/my-branch
      # make some changes with your favorite editor
      git add .
      git commit -m 'updating the site with new features or fixes'
      git push origin $USERNAME/my-branch
  28. Procedure

      1. Make changes to the Repository, open a pull request
      2. Wait for Buildkite verification, peer approval, then merge
      3. Wait for Expeditor to complete merge actions
      4. Verify acceptance with stakeholders
      5. Promote to Production
  29. Questions?

      Joshua Timberman <[email protected]>
      @jtimberman

      Resources and Links
      https://www.habitat.sh
      https://www.lita.io/
      https://buildkite.com
      https://github.com/habitat-sh/habitat-operator
      https://github.com/coreos/prometheus-operator