Slide 1

Slide 1 text

Infrastructure as code should contain code
Gianluca Arbezzano - @gianarb - SRE at InfluxData

Slide 2

Slide 2 text

Gianluca Arbezzano
Site Reliability Engineer @InfluxData
● https://gianarb.it
● @gianarb
What I like:
● I make dirty hacks that look awesome
● I grow my vegetables
● Travel for fun and work

Slide 3

Slide 3 text

@gianarb - [email protected]

Slide 4

Slide 4 text

No content

Slide 5

Slide 5 text

No content

Slide 6

Slide 6 text

No content

Slide 7

Slide 7 text

© 2018 InfluxData. All rights reserved. 7 @gianarb - [email protected] github.com/influxdata

Slide 8

Slide 8 text

I am not against YAML/JSON but against unmaintainable YAML/JSON.

Slide 9

Slide 9 text

No content

Slide 10

Slide 10 text

The goals of automation
● The ability to run repeatable tasks with a single click or continuously.
● The possibility to build pipelines using products like Kubernetes, AWS, and Jenkins.
● A friendly UX for managing “ops”.
Image credit: Pixabay

Slide 11

Slide 11 text

Why is Kubernetes so powerful, complex, and widely adopted?

Slide 12

Slide 12 text

Why is AWS so expensive?

Slide 13

Slide 13 text

What do you do to justify these costs?

Slide 14

Slide 14 text

No content

Slide 15

Slide 15 text

    apiVersion: extensions/v1beta1
    kind: Deployment
    metadata:
      name: {{ template "drone.fullname" . }}-agent
      labels:
        app: {{ template "drone.name" . }}
        chart: "{{ .Chart.Name }}-{{ .Chart.Version }}"
        release: "{{ .Release.Name }}"
        heritage: "{{ .Release.Service }}"
        component: agent
    spec:
      replicas: {{ .Values.agent.replicas }}
      template:
        metadata:
          annotations:
            checksum/secrets: {{ include (print $.Template.BasePath "/secrets.yaml") . | sha256sum }}
    {{- if .Values.agent.annotations }}
    {{ toYaml .Values.agent.annotations | indent 8 }}
    {{- end }}
          labels:
            app: {{ template "drone.name" . }}
            release: "{{ .Release.Name }}"
            component: agent

Slide 16

Slide 16 text

APIs are the key to your success!
Image credit: Pixabay

Slide 17

Slide 17 text

What do I mean by “code”?

Slide 18

Slide 18 text

Long TTL
● They don’t change very often
● They never change
● They have a controllable “state”

Short TTL
● They change frequently
● Their state is controlled from outside
● They depend on external factors

Slide 19

Slide 19 text

No content

Slide 20

Slide 20 text

YAML PRO: You don’t need to know a programming language to use it

Slide 21

Slide 21 text

So you are not like me, always keeping the Kubernetes or AWS docs open to look up YAML formats?

Slide 22

Slide 22 text

Code PRO: A Go struct brings compile-time validation and inline docs (which means no unknown fields)
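To make the point concrete, here is a minimal sketch. `WorkerPool` and `decodeStrict` are hypothetical names invented for this example; they are not Kubernetes or AWS types. It contrasts a typed struct, where a misspelled field fails at compile time, with a data file, where the same typo is only caught if the loader is strict.

```go
package main

import (
	"encoding/json"
	"fmt"
	"strings"
)

// WorkerPool is a hypothetical resource type used only for illustration.
type WorkerPool struct {
	// Name identifies the pool.
	Name string `json:"name"`
	// Replicas is the desired number of workers.
	Replicas int32 `json:"replicas"`
}

// decodeStrict parses a JSON document into a WorkerPool and rejects
// any field that the struct does not declare.
func decodeStrict(doc string) (WorkerPool, error) {
	var p WorkerPool
	dec := json.NewDecoder(strings.NewReader(doc))
	dec.DisallowUnknownFields()
	err := dec.Decode(&p)
	return p, err
}

func main() {
	// In code, a typo such as WorkerPool{Replcas: 3} does not compile.
	p := WorkerPool{Name: "ci", Replicas: 3}
	fmt.Println(p.Name, p.Replicas)

	// In a data file, the same typo is reported only by a strict decoder.
	_, err := decodeStrict(`{"name": "ci", "replcas": 3}`)
	fmt.Println("typo rejected:", err != nil)
}
```

The field comments double as inline documentation that editors and `go doc` can surface, which is the second half of the claim on the slide.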

Slide 23

Slide 23 text

Write friendly utilities to manipulate the resources

    // Create the pool of services you need to deploy
    apps := []runtime.Object{}
    apps = append(apps, etcd.New()...)
    apps = append(apps, monitor.New()...)
    apps = append(apps, kafka.New()...)

    // Declare the transformations you need to apply to the declared resources
    var ops []Op
    ops = append(ops,
        WithNoLimits(),
        WithMaxReplicas(1),
        WithReplicas(map[string]int{
            "etcd": 3,
        }),
        WithNoAffinity(),
        WithNoNodePorts())

    // Apply the transformations
    apps = Transform(ops...)(apps)

    // Deploy the transformed resources using the client SDK

Slide 24

Slide 24 text

Testing what you expect avoids mistakes

    prod := []runtime.Object{}
    prod = append(prod, service1.New()...)
    prod = append(prod, service2.New()...)
    prod = append(prod, mysql.New()...)
    prod = append(prod, kafka.New()...)

    var ops []Op
    ops = append(ops,
        WithMaxReplicas(1),
        WithReplicas(map[string]int{
            "service1": 3,
            "service2": 21,
        }))
    apps := Transform(ops...)(prod)

    // You can write a unit test to enforce what you need in the
    // prod environment, or to prevent changes to something that
    // should not be changed.
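One possible shape for such a check, as a hedged sketch: `Workload` and `CheckReplicas` are hypothetical names, and `Workload` is a simplified stand-in for a Kubernetes StatefulSet so that the example runs without external dependencies. In a real project the same comparison would probably live in a `_test.go` file and report through `t.Errorf`.

```go
package main

import "fmt"

// Workload is a simplified, hypothetical stand-in for a StatefulSet.
type Workload struct {
	Name     string
	Replicas int32
}

// CheckReplicas returns one error per workload whose replica count
// differs from the expected value. Workloads that are absent from
// want are ignored.
func CheckReplicas(got []Workload, want map[string]int32) []error {
	var errs []error
	for _, w := range got {
		if r, ok := want[w.Name]; ok && w.Replicas != r {
			errs = append(errs, fmt.Errorf("%s: got %d replicas, want %d", w.Name, w.Replicas, r))
		}
	}
	return errs
}

func main() {
	// Pretend this is the result of Transform(ops...)(prod).
	prod := []Workload{{"service1", 3}, {"service2", 20}}

	// The expectations are the values the prod environment must keep.
	want := map[string]int32{"service1": 3, "service2": 21}
	for _, err := range CheckReplicas(prod, want) {
		fmt.Println(err)
	}
}
```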

Slide 25

Slide 25 text

    // WithReplicas matches the serviceReplicas key to the statefulset service name
    // and sets the number of replicas to the value.
    func WithReplicas(serviceReplicas map[string]int) Op {
        return func(objs []runtime.Object) []runtime.Object {
            for _, o := range objs {
                switch app := o.(type) {
                case *appsv1.StatefulSet:
                    for name, replicas := range serviceReplicas {
                        if app.Spec.ServiceName == name {
                            r := int32(replicas)
                            app.Spec.Replicas = &r
                        }
                    }
                }
            }
            return objs
        }
    }
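The slides use `Op` and `Transform` without showing their definitions. The sketch below is one way they could be written, inferred from the call `Transform(ops...)(apps)`; it is an assumption, not the author's code. `Object` and `StatefulSet` are simplified local stand-ins for `runtime.Object` and `appsv1.StatefulSet` so that the example runs without the Kubernetes modules.

```go
package main

import "fmt"

// Object stands in for runtime.Object in this sketch.
type Object interface{}

// StatefulSet is a simplified, hypothetical version of the Kubernetes type.
type StatefulSet struct {
	ServiceName string
	Replicas    int32
}

// Op rewrites a list of objects and returns the result.
type Op func(objs []Object) []Object

// Transform composes ops into a single Op that applies them in order.
func Transform(ops ...Op) Op {
	return func(objs []Object) []Object {
		for _, op := range ops {
			objs = op(objs)
		}
		return objs
	}
}

// WithMaxReplicas caps every StatefulSet at limit replicas.
func WithMaxReplicas(limit int32) Op {
	return func(objs []Object) []Object {
		for _, o := range objs {
			if s, ok := o.(*StatefulSet); ok && s.Replicas > limit {
				s.Replicas = limit
			}
		}
		return objs
	}
}

// WithReplicas sets the replica count for the named services.
func WithReplicas(byService map[string]int32) Op {
	return func(objs []Object) []Object {
		for _, o := range objs {
			if s, ok := o.(*StatefulSet); ok {
				if r, found := byService[s.ServiceName]; found {
					s.Replicas = r
				}
			}
		}
		return objs
	}
}

func main() {
	apps := []Object{
		&StatefulSet{ServiceName: "etcd", Replicas: 5},
		&StatefulSet{ServiceName: "kafka", Replicas: 5},
	}
	// Order matters: the cap is applied first, then etcd is raised to 3.
	apps = Transform(
		WithMaxReplicas(1),
		WithReplicas(map[string]int32{"etcd": 3}),
	)(apps)
	for _, o := range apps {
		s := o.(*StatefulSet)
		fmt.Println(s.ServiceName, s.Replicas)
	}
}
```

Because each `Op` is a plain function, a new environment is just a different list of ops over the same base resources.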

Slide 26

Slide 26 text

Develop your pipeline
● Monitorable
● Reproducible
● Easy to expand
● Self-documenting

Slide 27

Slide 27 text

Automatically set Kubernetes labels or taints from EC2 tags

Slide 28

Slide 28 text

Kubernetes cluster in AWS / Labels and taints from EC2 tags
Every autoscaling group can pass tags through to its underlying EC2 instances. We wrote a Go application that uses a shared informer to listen for newly registered nodes.

Slide 29

Slide 29 text

Kubernetes cluster in AWS / Labels and taints from EC2 tags
When a new EC2 instance joins the cluster, we catch the event, and if the instance carries specific tags we apply them as taints or labels on the node.

Slide 30

Slide 30 text

Kubernetes cluster in AWS / Labels and taints from EC2 tags
EC2 tag: kubernetes/aws-labeler/label/role=ci
K8S label: awslabeler.influxdata.com/role=ci
EC2 tag: kubernetes/aws-labeler/taint/role=ci:NoSchedule
K8S taint: awslabeler.influxdata.com/role=ci:NoSchedule
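The mapping shown above can be sketched as a small pure function. `FromTag` is a hypothetical name, and the code assumes the EC2 tag arrives as a separate key and value; it illustrates the naming convention only and is not the labeler's actual source.

```go
package main

import (
	"fmt"
	"strings"
)

const (
	tagPrefix = "kubernetes/aws-labeler/"    // prefix of the EC2 tag key
	k8sPrefix = "awslabeler.influxdata.com/" // prefix of the Kubernetes key
)

// FromTag converts one EC2 tag into a Kubernetes label or taint string.
// kind is "label" or "taint"; ok is false when the tag is not meant
// for the labeler.
func FromTag(key, value string) (kind, out string, ok bool) {
	if !strings.HasPrefix(key, tagPrefix) {
		return "", "", false
	}
	parts := strings.SplitN(strings.TrimPrefix(key, tagPrefix), "/", 2)
	if len(parts) != 2 || parts[1] == "" {
		return "", "", false
	}
	if parts[0] != "label" && parts[0] != "taint" {
		return "", "", false
	}
	return parts[0], k8sPrefix + parts[1] + "=" + value, true
}

func main() {
	tags := map[string]string{
		"kubernetes/aws-labeler/label/role": "ci",
		"kubernetes/aws-labeler/taint/role": "ci:NoSchedule",
		"Name":                              "worker-1", // ignored by the labeler
	}
	for k, v := range tags {
		if kind, out, ok := FromTag(k, v); ok {
			fmt.Println(kind, out)
		}
	}
}
```

Keeping the translation free of API calls makes it easy to unit test; the informer callback would only need to call it for each tag and patch the node with the result.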

Slide 31

Slide 31 text

Kubernetes cluster in AWS / Long TTL
1. Security groups
2. VPC
3. Subnets
4. Route53 zones
5. AMI

Slide 32

Slide 32 text

AWS Autoscaling Group / Short TTL
1. AWS Autoscaling Groups guarantee the expected number of workers.
2. Each Autoscaling Group can run a different Kubernetes version and have its own labels, instance type, security groups, AMI, and so on.
3. Autoscaling Groups are provided as a service by AWS, so they need no maintenance at all.

Slide 33

Slide 33 text

Kubernetes cluster in AWS

Slide 34

Slide 34 text

Kubernetes cluster in AWS / Master
1. The master is manually provisioned.
2. We use a Kubernetes CronJob to back up the etcd cluster.
3. We are practical folks. We know we can do better, but for now we are happy with what we have.

Slide 35

Slide 35 text

UserData and immutable infrastructure reduce the need for a config management tool like Ansible or Chef.

Slide 36

Slide 36 text

Split your provisioning into layers: use Packer or LinuxKit to build a base image and UserData to specialize it

Slide 37

Slide 37 text

CoreOS supports Ignition (OK, this is JSON, but it is not too much JSON!)

    {
      "ignition": {"version": "2.1.0"},
      "storage": {
        "directories": [{"filesystem": "root", "path": "/etc/kubernetes/manifests"}],
        "files": [{
          "filesystem": "root",
          "path": "/etc/ssl/AmazonRootCA1.pem",
          "contents": {"source": "https://www.amazontrust.com/repository/AmazonRootCA1.pem"}
        }]
      },
      "systemd": { … }
    }

Slide 38

Slide 38 text

The shorter the TTL of a single server, the more secure it is

Slide 39

Slide 39 text

Wrap up
● JSON and YAML are not the problem
● Long TTL vs Short TTL
● Use the API!
● Patterns and methodology to write good infra as code
● Testing to enforce safety
● Inline documentation
● Code Review
● Immutable Infrastructure

Slide 40

Slide 40 text

Credits and links
● https://gianarb.it/blog/infrastructure-as-real-code
● https://twitter.com/danveloper/status/1078870433246662656
● https://blog.couchbase.com/kubernetes-operators-game-changer/
● https://gianarb.it/blog/reactive-planning-is-a-cloud-native-pattern
● https://engineering.bitnami.com/articles/a-deep-dive-into-kubernetes-controllers.html

Slide 41

Slide 41 text

Reach out: @gianarb, [email protected], https://gianarb.it
Any questions?