Slide 1

Slide 1 text

Gianluca Arbezzano
Site Reliability Engineer @InfluxData
● https://gianarb.it
● @gianarb
What I like:
● I make dirty hacks that look awesome
● I grow my vegetables
● Travel for fun and work

Slide 2

Slide 2 text

@gianarb - [email protected]

Slide 3

Slide 3 text

No content

Slide 4

Slide 4 text

No content

Slide 5

Slide 5 text

No content

Slide 6

Slide 6 text

© 2018 InfluxData. All rights reserved. 6 @gianarb - gianluca@influxdb.com

Slide 7

Slide 7 text

DevOps likes automation. Automation likes code. YAML is not code.

Slide 8

Slide 8 text

Inspired by true events

Slide 9

Slide 9 text

Kubernetes

Slide 10

Slide 10 text

1. You know! Your team knows and uses Docker for local development and testing.
2. Kubernetes! Everyone is talking about Kubernetes.
3. Hire! You don’t know why, but you hired a DevOps engineer who kind of knows k8s.
4. Excitement! You are moving everything and everyone to Kubernetes.

Slide 11

Slide 11 text

We need to get our hands dirty

Slide 12

Slide 12 text

Spin up a cluster that you can break
Bring developers into the loop

Slide 13

Slide 13 text

Deploy CI on Kubernetes
Bring developers into the loop

Slide 14

Slide 14 text

Run your code in prod
Bring developers into the loop

Slide 15

Slide 15 text

K8s as code: From YAML to code (golang)
1. You can use Golang autocomplete as documentation and as a reference for every Kubernetes resource.
2. You feel less like a YAML engineer (great feeling btw).
3. Code is better than YAML! You can reuse it, compile it, and embed it in other projects.
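
To make the "reuse it, compile it" point concrete, here is a minimal sketch of a resource described in Go instead of YAML. The structs are hypothetical, heavily trimmed stand-ins and not the real Kubernetes API types; they are defined here only so the example runs with the standard library. The helper name NewAgentDeployment is also an assumption.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// Deployment is a hypothetical, trimmed-down stand-in for a Kubernetes
// Deployment. It is NOT the real API type; it keeps the sketch stdlib-only.
type Deployment struct {
	APIVersion string         `json:"apiVersion"`
	Kind       string         `json:"kind"`
	Metadata   Metadata       `json:"metadata"`
	Spec       DeploymentSpec `json:"spec"`
}

// Metadata carries the resource name and its labels.
type Metadata struct {
	Name   string            `json:"name"`
	Labels map[string]string `json:"labels,omitempty"`
}

// DeploymentSpec only models the replica count in this sketch.
type DeploymentSpec struct {
	Replicas int `json:"replicas"`
}

// NewAgentDeployment is a reusable constructor (hypothetical name): the
// compiler checks field names and types, which plain YAML cannot do.
func NewAgentDeployment(app string, replicas int) Deployment {
	return Deployment{
		APIVersion: "apps/v1",
		Kind:       "Deployment",
		Metadata: Metadata{
			Name:   app + "-agent",
			Labels: map[string]string{"app": app, "component": "agent"},
		},
		Spec: DeploymentSpec{Replicas: replicas},
	}
}

func main() {
	// Serialize the typed value to a JSON manifest.
	out, err := json.MarshalIndent(NewAgentDeployment("drone", 2), "", "  ")
	if err != nil {
		panic(err)
	}
	fmt.Println(string(out))
}
```

In a real project the same idea would likely use the official Kubernetes Go types, which is where the autocomplete-as-documentation benefit comes from.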

Slide 16

Slide 16 text

K8s as code: From YAML to code (golang)
Tiny CLI to make the migration to Golang
Some manual refactoring

Slide 17

Slide 17 text

K8s as code: From YAML to code (golang)
Tiny CLI to make the migration to Golang
Some manual refactoring
● Continue to improve our CI to validate that the YAML and Go files are the same, and that the resources in Kubernetes match the Go files.
● Maybe we will be able to remove the YAML at some point.
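
The CI check described above could look roughly like this. It is a sketch under assumptions, not the tool used in the talk: both manifests are assumed to be available as JSON (for example the YAML already converted, and the Go definition marshaled), and sameManifest is a hypothetical helper name.

```go
package main

import (
	"encoding/json"
	"fmt"
	"reflect"
)

// sameManifest reports whether two JSON-encoded manifests describe the same
// resource, ignoring key order and whitespace. Hypothetical helper.
func sameManifest(a, b []byte) (bool, error) {
	var left, right interface{}
	if err := json.Unmarshal(a, &left); err != nil {
		return false, fmt.Errorf("decoding first manifest: %w", err)
	}
	if err := json.Unmarshal(b, &right); err != nil {
		return false, fmt.Errorf("decoding second manifest: %w", err)
	}
	// Both sides decode into generic maps and slices, so a deep comparison
	// is independent of the original formatting.
	return reflect.DeepEqual(left, right), nil
}

func main() {
	// Assumed inputs: one manifest derived from YAML, one generated from Go.
	fromYAML := []byte(`{"kind":"Deployment","spec":{"replicas":2}}`)
	fromGo := []byte(`{"spec": {"replicas": 2}, "kind": "Deployment"}`)

	same, err := sameManifest(fromYAML, fromGo)
	if err != nil {
		panic(err)
	}
	fmt.Println("manifests match:", same)
}
```

A CI job could fail the build whenever the function returns false, which keeps the YAML and the Go definitions from drifting apart during the migration.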

Slide 18

Slide 18 text

GitOps
Your Git repository is the entrypoint for all your code changes. Infrastructure is ‘as code’, so the place where you make it happen should be Git.
Read more at weave.works: https://www.weave.works/technologies/gitops/

Slide 19

Slide 19 text

The secret of success

Slide 20

Slide 20 text

Don’t be scared: write your own tools!

Slide 21

Slide 21 text

Why is Kubernetes so powerful, complex and widely adopted?

Slide 22

Slide 22 text

Why is AWS so expensive?

Slide 23

Slide 23 text

What do you do to justify these costs?

Slide 24

Slide 24 text

No content

Slide 25

Slide 25 text

apiVersion: extensions/v1beta1
kind: Deployment
metadata:
  name: {{ template "drone.fullname" . }}-agent
  labels:
    app: {{ template "drone.name" . }}
    chart: "{{ .Chart.Name }}-{{ .Chart.Version }}"
    release: "{{ .Release.Name }}"
    heritage: "{{ .Release.Service }}"
    component: agent
spec:
  replicas: {{ .Values.agent.replicas }}
  template:
    metadata:
      annotations:
        checksum/secrets: {{ include (print $.Template.BasePath "/secrets.yaml") . | sha256sum }}
{{- if .Values.agent.annotations }}
{{ toYaml .Values.agent.annotations | indent 8 }}
{{- end }}
      labels:
        app: {{ template "drone.name" . }}
        release: "{{ .Release.Name }}"
        component: agent

Slide 26

Slide 26 text

APIs are the key to your success!
Image credit: Pixabay

Slide 27

Slide 27 text

No content

Slide 28

Slide 28 text

No content

Slide 29

Slide 29 text

containerd.io

Slide 30

Slide 30 text

No content

Slide 31

Slide 31 text

No content

Slide 32

Slide 32 text

No content

Slide 33

Slide 33 text

No content

Slide 34

Slide 34 text

No content

Slide 35

Slide 35 text

We use Docker as a replacement for systemd for process management

Slide 36

Slide 36 text

DIND - Docker in Docker
$ docker run -it --rm -v /var/run/docker.sock:/var/run/docker.sock docker sh
$ docker info
Containers: 48
 Running: 1
 Paused: 0
 Stopped: 47
containerd version: 9f2e07b1fc1342d1c48fe4d7bbb94cb6d1bf278b.m
runc version: 871ba2e58e24314d1fab4517a80410191ba5ad01
init version: fec3683
Kernel Version: 4.20.13-arch1-1-ARCH
Operating System: Arch Linux
OSType: linux
Architecture: x86_64
CPUs: 4
Total Memory: 15.42GiB
Name: gianarb
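
The command above mounts the host's Docker socket, so the container talks to the host daemon through its HTTP API. The same API can be called from your own tools. The sketch below assumes the daemon listens on /var/run/docker.sock and that the unversioned /info endpoint is accepted; dockerInfo, parseInfo and newSocketClient are hypothetical names, and only a small subset of the response fields is decoded.

```go
package main

import (
	"context"
	"encoding/json"
	"fmt"
	"io"
	"net"
	"net/http"
)

// dockerInfo holds a small subset of the fields returned by GET /info.
type dockerInfo struct {
	Containers        int    `json:"Containers"`
	ContainersRunning int    `json:"ContainersRunning"`
	Name              string `json:"Name"`
}

// parseInfo decodes the JSON body of an /info response.
func parseInfo(body []byte) (dockerInfo, error) {
	var info dockerInfo
	err := json.Unmarshal(body, &info)
	return info, err
}

// newSocketClient returns an HTTP client that dials a unix socket instead
// of a TCP address. The host part of the request URL is ignored.
func newSocketClient(socketPath string) *http.Client {
	return &http.Client{
		Transport: &http.Transport{
			DialContext: func(ctx context.Context, _, _ string) (net.Conn, error) {
				var d net.Dialer
				return d.DialContext(ctx, "unix", socketPath)
			},
		},
	}
}

func main() {
	// Assumption: the daemon socket lives at the default path.
	client := newSocketClient("/var/run/docker.sock")
	resp, err := client.Get("http://docker/info")
	if err != nil {
		fmt.Println("docker daemon not reachable:", err)
		return
	}
	defer resp.Body.Close()

	body, err := io.ReadAll(resp.Body)
	if err != nil {
		fmt.Println("reading response:", err)
		return
	}
	info, err := parseInfo(body)
	if err != nil {
		fmt.Println("decoding response:", err)
		return
	}
	fmt.Printf("%s: %d containers, %d running\n", info.Name, info.Containers, info.ContainersRunning)
}
```

Keeping the decoding in its own function makes it testable without a running daemon.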

Slide 37

Slide 37 text

UX for Ops, because everyone needs to feel at home...

Slide 38

Slide 38 text

Instrumentation, observability and monitoring

Slide 39

Slide 39 text

The secret is all about how you combine things together

Slide 40

Slide 40 text

Metrics

Slide 41

Slide 41 text

Logs

Slide 42

Slide 42 text

Traces

Slide 43

Slide 43 text

Often our aggregations look a bit twisted...

Slide 44

Slide 44 text

Distributed Tracing
Tracing is a way to correlate logs using a set of IDs
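
As a rough illustration of "correlating logs using a set of IDs", the sketch below gives every log line a trace ID shared by the whole request, plus a span ID and a parent ID that describe where the line sits in the call tree. The span type and its methods are hypothetical and far simpler than what a real tracer provides.

```go
package main

import (
	"crypto/rand"
	"encoding/hex"
	"fmt"
)

// span is a hypothetical, minimal span. Real tracers carry much more
// (timestamps, tags, sampling decisions).
type span struct {
	TraceID  string // shared by every span of the same request
	SpanID   string // unique per unit of work
	ParentID string // SpanID of the caller, empty for the root
}

// newID returns n random bytes encoded as hex.
func newID(n int) string {
	b := make([]byte, n)
	if _, err := rand.Read(b); err != nil {
		panic(err)
	}
	return hex.EncodeToString(b)
}

// startTrace creates the root span of a new trace.
func startTrace() span {
	return span{TraceID: newID(16), SpanID: newID(8)}
}

// child creates a span in the same trace, pointing back at its parent.
func (s span) child() span {
	return span{TraceID: s.TraceID, SpanID: newID(8), ParentID: s.SpanID}
}

// logLine prefixes a message with the IDs needed to correlate it later.
func (s span) logLine(msg string) string {
	return fmt.Sprintf("trace_id=%s span_id=%s parent_id=%s msg=%q",
		s.TraceID, s.SpanID, s.ParentID, msg)
}

func main() {
	root := startTrace()
	db := root.child()
	fmt.Println(root.logLine("handling request"))
	fmt.Println(db.logLine("querying database"))
}
```

Filtering logs by trace_id returns everything that happened for one request, and the parent_id links let you rebuild the order of the calls.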

Slide 45

Slide 45 text

No content

Slide 46

Slide 46 text

Normal state vs Current state

Slide 47

Slide 47 text

Instrumentation code is a first-class citizen in your codebase: OpenCensus
● Open Source project sponsored by Google
● It is a spec plus a set of libraries in different languages to instrument your application
● It collects metrics, traces and events

Slide 48

Slide 48 text

OpenCensus
Common interface to collect stats and traces from your app
Different exporters to persist your data
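
The "one interface, many exporters" idea can be sketched in a few lines. This is not the OpenCensus API: Measurement, Exporter, Recorder and memoryExporter are hypothetical names used only to show the shape of the design.

```go
package main

import "fmt"

// Measurement is a hypothetical data point recorded by the application.
type Measurement struct {
	Name  string
	Value float64
}

// Exporter is the hypothetical contract every backend implements.
type Exporter interface {
	Export(m Measurement) error
}

// memoryExporter keeps formatted measurements in memory. A real exporter
// would send them to a backend instead.
type memoryExporter struct {
	lines []string
}

func (e *memoryExporter) Export(m Measurement) error {
	e.lines = append(e.lines, fmt.Sprintf("%s=%g", m.Name, m.Value))
	return nil
}

// Recorder is the single entry point the application code talks to.
type Recorder struct {
	exporters []Exporter
}

// Register adds a backend without touching the instrumented code.
func (r *Recorder) Register(e Exporter) {
	r.exporters = append(r.exporters, e)
}

// Record fans a measurement out to every registered exporter.
func (r *Recorder) Record(name string, value float64) error {
	m := Measurement{Name: name, Value: value}
	for _, e := range r.exporters {
		if err := e.Export(m); err != nil {
			return err
		}
	}
	return nil
}

func main() {
	first, second := &memoryExporter{}, &memoryExporter{}
	r := &Recorder{}
	r.Register(first)
	r.Register(second)

	if err := r.Record("http_requests", 1); err != nil {
		panic(err)
	}
	fmt.Println(first.lines, second.lines)
}
```

The application only ever calls Record, so switching or adding a storage backend is a change in the wiring, not in the instrumentation.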

Slide 49

Slide 49 text

# HELP http_requests_total The total number of HTTP requests.
# TYPE http_requests_total counter
http_requests_total{method="post",code="200"} 1027 1395066363000
http_requests_total{method="post",code="400"} 3 1395066363000

# Escaping in label values:
msdos_file_access_time_seconds{path="C:\\DIR\\FILE.TXT",error="Cannot find file:\n\"FILE.TXT\""} 1.458255915e9

# Minimalistic line:
metric_without_timestamp_and_labels 12.47

# A weird metric from before the epoch:
something_weird{problem="division by zero"} +Inf -3982045

# A histogram, which has a pretty complex representation in the text format:
# HELP http_request_duration_seconds A histogram of the request duration.
# TYPE http_request_duration_seconds histogram
http_request_duration_seconds_bucket{le="0.05"} 24054
http_request_duration_seconds_bucket{le="0.1"} 33444
http_request_duration_seconds_bucket{le="0.2"} 100392
http_request_duration_seconds_bucket{le="0.5"} 129389
http_request_duration_seconds_bucket{le="1"} 133988
http_request_duration_seconds_bucket{le="+Inf"} 144320
http_request_duration_seconds_sum 53423
http_request_duration_seconds_count 144320
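
The text format above is simple enough to produce by hand. The sketch below renders sample lines without timestamps; formatSample is a hypothetical helper, not part of any Prometheus library, and it sorts labels by name for deterministic output, so the label order can differ from the slide.

```go
package main

import (
	"fmt"
	"sort"
	"strings"
)

// labelEscaper escapes backslash, double quote and newline in label values,
// the three escapes shown in the example on the slide.
var labelEscaper = strings.NewReplacer(`\`, `\\`, `"`, `\"`, "\n", `\n`)

// formatSample renders one sample line: name{label="value",...} value.
// Hypothetical helper; timestamps are left out on purpose.
func formatSample(name string, labels map[string]string, value float64) string {
	if len(labels) == 0 {
		return fmt.Sprintf("%s %g", name, value)
	}
	// Map iteration order is random in Go, so sort the label names.
	keys := make([]string, 0, len(labels))
	for k := range labels {
		keys = append(keys, k)
	}
	sort.Strings(keys)

	pairs := make([]string, 0, len(keys))
	for _, k := range keys {
		pairs = append(pairs, k+`="`+labelEscaper.Replace(labels[k])+`"`)
	}
	return fmt.Sprintf("%s{%s} %g", name, strings.Join(pairs, ","), value)
}

func main() {
	fmt.Println("# HELP http_requests_total The total number of HTTP requests.")
	fmt.Println("# TYPE http_requests_total counter")
	fmt.Println(formatSample("http_requests_total",
		map[string]string{"method": "post", "code": "200"}, 1027))
	fmt.Println(formatSample("http_requests_total",
		map[string]string{"method": "post", "code": "400"}, 3))
}
```

In practice a client library would generate this output for you; writing it once by hand mostly helps to understand what a scrape returns.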

Slide 50

Slide 50 text

OpenMetrics v2 Prometheus exposition format

Slide 51

Slide 51 text

func FetchMetricFamilies(url string, ch chan<- *dto.MetricFamily, certificate string, key string, skipServerCertCheck bool) error {
	defer close(ch)
	var transport *http.Transport
	if certificate != "" && key != "" {
		cert, err := tls.LoadX509KeyPair(certificate, key)
		if err != nil {
			return err
		}
		tlsConfig := &tls.Config{
			Certificates:       []tls.Certificate{cert},
			InsecureSkipVerify: skipServerCertCheck,
		}
		tlsConfig.BuildNameToCertificate()
		transport = &http.Transport{TLSClientConfig: tlsConfig}
	} else {
		transport = &http.Transport{
			TLSClientConfig: &tls.Config{InsecureSkipVerify: skipServerCertCheck},
		}
	}
	// The excerpt on the slide ends here; the rest of the function is not shown.

https://github.com/prometheus/prom2json/blob/master/prom2json.go#L123

Slide 52

Slide 52 text

Summary:
★ Do not be scared to write your code!
★ Use the API
★ Instrumentation code is a first class citizen
★ Keep calm and observe all together!

Slide 53

Slide 53 text

Credits and Links
● https://www.weave.works/technologies/gitops/
● http://gianarb.it
● https://thenewstack.io/why-you-cant-afford-to-ignore-distributed-tracing-for-observability/
● https://www.honeycomb.io/blog/
● https://gianarb.it/blog/infra-as-code-short-long-ttl-resource
● https://gianarb.it/blog/kubernetes-shared-informer
● https://github.com/OpenObservability/OpenMetrics
● https://promcon.io/2018-munich/slides/openmetrics-transforming-the-prometheus-exposition-format-into-a-global-standard.pdf
● https://medium.com/opentracing/merging-opentracing-and-opencensus-f0fe9c7ca6f0

Slide 54

Slide 54 text

Thanks @gianarb