CfgMgmtCamp - Infrastructure as code should contain code

These days “infrastructure as code” means HCL, YAML, or JSON.
I will never buy that JSON is a programming language.
CloudFormation tries hard to mix JSON with keywords that become functions at runtime, and claims this is a maintainable approach.
Helm pushes just as hard, claiming that YAML with parameters that are replaced by values at runtime is a flexible and maintainable approach.
Infrastructure as code means that you are supposed to use a programming language, because otherwise it won’t work.
Some YAML evangelists will tell you that a “human-friendly data serialization standard” is better than code because it keeps you from writing the weird and wrong code that you are not supposed to write when doing infrastructure.
The truth is that you need to code better! At InfluxData we know that, and this talk is about our journey from YAML to Go to manage our Kubernetes cluster:
what we faced and why we think it is way better!

Gianluca Arbezzano

February 05, 2019

Transcript

  1. Gianluca Arbezzano - @gianarb - SRE at InfluxData
    Infrastructure as code
    should contain code

  2. Gianluca Arbezzano
    Site Reliability Engineer @InfluxData
    ● https://gianarb.it
    ● @gianarb
    What I like:
    ● I make dirty hacks that look awesome
    ● I grow my vegetables
    ● Travel for fun and work

  3. @gianarb - [email protected]

  7. © 2018 InfluxData. All rights reserved.
    github.com/influxdata

  8. I am not against YAML/JSON
    but against unmaintainable
    YAML/JSON.

  10. The automation’s goals
    ● The ability to run repeatable tasks with a
    single click or continuously.
    ● The possibility to build pipelines using
    products like Kubernetes, AWS, Jenkins.
    ● We would like to have a friendly UX to
    manage “ops”.
    Image credit: Pixabay

  11. Why is Kubernetes
    so powerful, complex
    and widely adopted?

  12. Why is AWS
    so expensive?

  13. What do you do
    to justify these costs?

  15. Helm chart template for the Drone agent Deployment:
    apiVersion: extensions/v1beta1
    kind: Deployment
    metadata:
      name: {{ template "drone.fullname" . }}-agent
      labels:
        app: {{ template "drone.name" . }}
        chart: "{{ .Chart.Name }}-{{ .Chart.Version }}"
        release: "{{ .Release.Name }}"
        heritage: "{{ .Release.Service }}"
        component: agent
    spec:
      replicas: {{ .Values.agent.replicas }}
      template:
        metadata:
          annotations:
            checksum/secrets: {{ include (print $.Template.BasePath "/secrets.yaml") . | sha256sum }}
    {{- if .Values.agent.annotations }}
    {{ toYaml .Values.agent.annotations | indent 8 }}
    {{- end }}
          labels:
            app: {{ template "drone.name" . }}
            release: "{{ .Release.Name }}"
            component: agent

  16. APIs are
    the key to
    your success!
    Image credit: Pixabay

  17. What do I mean
    by “code”?

  18. Short TTL vs. Long TTL
    Short TTL:
    ● Change frequently
    ● They have a state controllable from outside
    ● Depend on the outside
    Long TTL:
    ● Don’t change very often
    ● They never change
    ● They have a controllable “state”

  20. YAML PRO:
    You don’t need to know a
    programming language to use it

  21. So you are not like me, always with the k8s docs or the AWS docs
    open to look up the YAML formats?

  22. Code PRO:
    A Go struct brings compile-time
    validation and inline docs
    (which means no unknown fields)
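
A minimal sketch of the point above. DeploymentSpec and describe are hypothetical names made up for illustration; the real Kubernetes types live in k8s.io/api and are much larger. The doc comments on the fields are what an editor would show inline, and the commented-out line shows the kind of typo the compiler rejects.

```go
package main

import "fmt"

// DeploymentSpec is a hypothetical, trimmed-down stand-in for a Kubernetes type.
type DeploymentSpec struct {
	// Name is the name of the workload.
	Name string
	// Replicas is the desired number of pods.
	Replicas int32
}

// describe formats a spec as "<name> x<replicas>".
func describe(d DeploymentSpec) string {
	return fmt.Sprintf("%s x%d", d.Name, d.Replicas)
}

func main() {
	d := DeploymentSpec{Name: "etcd", Replicas: 3}
	// The next line would not compile, because Replicsa is not a field of DeploymentSpec:
	// d = DeploymentSpec{Name: "etcd", Replicsa: 3}
	fmt.Println(describe(d))
}
```

The same typo in a YAML manifest would only surface when the file is validated or applied, if at all.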

  23. Write friendly utilities to manipulate the resources
    // Create the pool of services you need to deploy
    apps := []runtime.Object{}
    apps = append(apps, etcd.New()...)
    apps = append(apps, monitor.New()...)
    apps = append(apps, kafka.New()...)
    // Declare the transformations you need to apply to the declared resources
    ops := []Op{}
    ops = append(ops,
        WithNoLimits(),
        WithMaxReplicas(1),
        WithReplicas(map[string]int{
            "etcd": 3,
        }),
        WithNoAffinity(),
        WithNoNodePorts())
    // Apply the transformations
    apps = Transform(ops...)(apps)
    // Deploy the transformed resources using the client SDK
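
The slide does not show how Transform composes the operations. The following is a sketch under assumptions, not the InfluxData implementation: Object and StatefulSet are hypothetical stand-ins for runtime.Object and appsv1.StatefulSet so the example runs with the standard library only, and the body of WithMaxReplicas is a guess based on its name.

```go
package main

import "fmt"

// Object stands in for runtime.Object (assumption, to avoid external imports).
type Object interface{}

// StatefulSet is a hypothetical, trimmed-down stand-in for appsv1.StatefulSet.
type StatefulSet struct {
	ServiceName string
	Replicas    int32
}

// Op transforms a list of resources and returns the result.
type Op func([]Object) []Object

// Transform composes ops into a single Op that applies them in order.
func Transform(ops ...Op) Op {
	return func(objs []Object) []Object {
		for _, op := range ops {
			objs = op(objs)
		}
		return objs
	}
}

// WithMaxReplicas caps the replicas of every StatefulSet at max.
func WithMaxReplicas(max int32) Op {
	return func(objs []Object) []Object {
		for _, o := range objs {
			if s, ok := o.(*StatefulSet); ok && s.Replicas > max {
				s.Replicas = max
			}
		}
		return objs
	}
}

// WithReplicas sets the replicas of the StatefulSets whose service name is in byName.
func WithReplicas(byName map[string]int32) Op {
	return func(objs []Object) []Object {
		for _, o := range objs {
			if s, ok := o.(*StatefulSet); ok {
				if r, found := byName[s.ServiceName]; found {
					s.Replicas = r
				}
			}
		}
		return objs
	}
}

func main() {
	apps := []Object{
		&StatefulSet{ServiceName: "etcd", Replicas: 5},
		&StatefulSet{ServiceName: "kafka", Replicas: 5},
	}
	// Order matters: the cap is applied first, then etcd is raised to 3.
	apps = Transform(
		WithMaxReplicas(1),
		WithReplicas(map[string]int32{"etcd": 3}),
	)(apps)
	for _, o := range apps {
		s := o.(*StatefulSet)
		fmt.Println(s.ServiceName, s.Replicas)
	}
}
```

Because each Op is a plain function, a new rule is one more function in the list, and the order of the list documents which rule wins.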

  24. Testing what you expect avoids mistakes
    prod := []runtime.Object{}
    prod = append(prod, service1.New()...)
    prod = append(prod, service2.New()...)
    prod = append(prod, mysql.New()...)
    prod = append(prod, kafka.New()...)
    ops := []Op{}
    ops = append(ops,
        WithMaxReplicas(1),
        WithReplicas(map[string]int{
            "service1": 3,
            "service2": 21,
        }))
    apps := Transform(ops...)(prod)
    // You can write a unit test to enforce what you need in the
    // prod environment, or to prevent changes to something that
    // should not be changed.
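
One possible shape for such a check, as a sketch only. StatefulSet is a hypothetical stand-in for appsv1.StatefulSet, and CheckReplicas is an illustrative helper that does not appear in the talk.

```go
package main

import "fmt"

// StatefulSet is a hypothetical, trimmed-down stand-in for appsv1.StatefulSet.
type StatefulSet struct {
	ServiceName string
	Replicas    int32
}

// CheckReplicas returns one error for each StatefulSet whose replica count
// differs from the expectation recorded for its service name.
func CheckReplicas(sets []StatefulSet, want map[string]int32) []error {
	var errs []error
	for _, s := range sets {
		expected, ok := want[s.ServiceName]
		if !ok {
			continue // no expectation recorded for this service
		}
		if s.Replicas != expected {
			errs = append(errs, fmt.Errorf("%s: got %d replicas, want %d",
				s.ServiceName, s.Replicas, expected))
		}
	}
	return errs
}

func main() {
	prod := []StatefulSet{
		{ServiceName: "service1", Replicas: 3},
		{ServiceName: "service2", Replicas: 1}, // lowered by mistake
	}
	want := map[string]int32{"service1": 3, "service2": 21}
	for _, err := range CheckReplicas(prod, want) {
		fmt.Println("FAIL:", err)
	}
}
```

In a real repository the same check would more likely sit in a `_test.go` file and report through `t.Errorf`, so that CI blocks the change before it reaches the cluster.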

  25. The WithReplicas transformation:
    // WithReplicas matches the serviceReplicas key to the statefulset service name
    // and sets the number of replicas to the value.
    func WithReplicas(serviceReplicas map[string]int) Op {
        return func(objs []runtime.Object) []runtime.Object {
            for _, o := range objs {
                switch app := o.(type) {
                case *appsv1.StatefulSet:
                    for name, replicas := range serviceReplicas {
                        if app.Spec.ServiceName == name {
                            r := int32(replicas)
                            app.Spec.Replicas = &r
                        }
                    }
                }
            }
            return objs
        }
    }

  26. Develop your pipeline
    ● Monitorable
    ● Reproducible
    ● Easy to expand
    ● Self-documenting

  27. Automatically set K8S labels or taints
    from EC2 tags

  28. Kubernetes cluster in AWS / Labels and taints from EC2 tags
    Every autoscaling group can pass its
    tags through to the underlying EC2 instances.
    We wrote a Go application that uses a
    shared informer to listen for newly registered
    nodes.

  29. Kubernetes cluster in AWS / Labels and taints from EC2 tags
    When a new EC2 instance joins the cluster, we
    catch the event, and if the instance
    carries specific tags we apply them as
    taints or labels on the node.

  30. Kubernetes cluster in AWS / Labels and taints from EC2 tags
    EC2 tag: kubernetes/aws-labeler/label/role=ci
    K8S label: awslabeler.influxdata.com/role=ci
    EC2 tag: kubernetes/aws-labeler/taint/role=ci:NoSchedule
    K8S Taint: awslabeler.influxdata.com/role=ci:NoSchedule
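
The mapping above can be sketched as a pure function. This is an illustration under assumptions, not the actual aws-labeler code: it assumes the EC2 tag key is the part before `=` and the tag value the part after it, Taint is a hypothetical stand-in for corev1.Taint, and FromTags is a made-up name.

```go
package main

import (
	"fmt"
	"strings"
)

const (
	labelTagPrefix = "kubernetes/aws-labeler/label/"
	taintTagPrefix = "kubernetes/aws-labeler/taint/"
	k8sPrefix      = "awslabeler.influxdata.com/"
)

// Taint is a hypothetical, trimmed-down stand-in for corev1.Taint.
type Taint struct {
	Key, Value, Effect string
}

// FromTags splits EC2 tags into node labels and taints.
// Tags without one of the two prefixes are ignored.
func FromTags(tags map[string]string) (map[string]string, []Taint) {
	labels := map[string]string{}
	var taints []Taint
	for k, v := range tags {
		switch {
		case strings.HasPrefix(k, labelTagPrefix):
			labels[k8sPrefix+strings.TrimPrefix(k, labelTagPrefix)] = v
		case strings.HasPrefix(k, taintTagPrefix):
			// A taint tag value is expected to look like "value:Effect".
			parts := strings.SplitN(v, ":", 2)
			if len(parts) != 2 {
				continue // skip taints that carry no effect
			}
			taints = append(taints, Taint{
				Key:    k8sPrefix + strings.TrimPrefix(k, taintTagPrefix),
				Value:  parts[0],
				Effect: parts[1],
			})
		}
	}
	return labels, taints
}

func main() {
	labels, taints := FromTags(map[string]string{
		"kubernetes/aws-labeler/label/role": "ci",
		"kubernetes/aws-labeler/taint/role": "ci:NoSchedule",
		"Name":                              "worker-1",
	})
	fmt.Println(labels)
	fmt.Println(taints)
}
```

Keeping the translation separate from the informer callback makes it easy to unit test without a cluster or an AWS account.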

  31. Kubernetes cluster in AWS / Long TTL
    1. Security groups
    2. VPC
    3. Subnets
    4. Route53 zones
    5. AMI

  32. AWS Autoscaling Group / Short TTL
    1. AWS Autoscaling Groups guarantee the expected number of workers.
    2. Every Autoscaling Group can run a different Kubernetes version, and it can have different
    labels, instance types, security groups, AMIs and so on.
    3. Autoscaling Groups are provided as a service by AWS. No maintenance at all.

  33. Kubernetes cluster in AWS

  34. Kubernetes cluster in AWS / Master
    1. The master is manually provisioned.
    2. We use a Kubernetes CronJob to back up the etcd cluster.
    3. We are practical folks. We know we can do better, but for now we are happy with what
    we have.

  35. UserData and immutable
    infrastructure reduce the need
    for a config management tool
    like Ansible or Chef.

  36. Split your provisioning into layers,
    using Packer or LinuxKit to build
    a base image and UserData to
    specialize it

  37. CoreOS supports Ignition (OK, this is JSON... but it is not too much JSON!)
    {
      "ignition": {"version": "2.1.0"},
      "storage": {
        "directories": [{"filesystem": "root", "path": "/etc/kubernetes/manifests"}],
        "files": [{
          "filesystem": "root",
          "path": "/etc/ssl/AmazonRootCA1.pem",
          "contents": {"source": "https://www.amazontrust.com/repository/AmazonRootCA1.pem"}
        }]
      },
      "systemd": { … }
    }

  38. The shorter the TTL of a single
    server, the more secure it is

  39. Wrap up
    ● JSON and YAML are not the problem
    ● Long TTL vs Short TTL
    ● Use the API!
    ● Patterns and methodology to write good
    infra as code
    ● Testing to enforce safety
    ● Inline documentation
    ● Code review
    ● Immutable infrastructure

  40. Credits and links
    ● https://gianarb.it/blog/infrastructure-as-real-code
    ● https://twitter.com/danveloper/status/1078870433246662656
    ● https://blog.couchbase.com/kubernetes-operators-game-changer/
    ● https://gianarb.it/blog/reactive-planning-is-a-cloud-native-pattern
    ● https://engineering.bitnami.com/articles/a-deep-dive-into-kubernetes-controllers.html

  41. Reach out:
    @gianarb
    [email protected]
    https://gianarb.it
    Any questions?
