SysadminDays - Nestor

Saâd Dif
November 19, 2019

Deploying and managing the lifecycle of microservices on Kubernetes with an in-house tool

Kapten's tech team was looking for a tool to deploy and manage its microservices. The quest looked gloomy at the time: no existing tool answered all of our knights' pressing needs. The knights of the round table gathered and embraced the challenge of giving birth to Nestor. One tool to rule them all!

Nestor is our valiant tool allowing every soul in the tech kingdom to interact with microservices without the need to be a container or orchestration wizard.

Our fearless tech team uses it daily. Its goal is simple yet complex: manage, configure and deploy our audacious microservices stack, which is growing at the speed of light.

Transcript

  1. Nestor
    A tool to rule ‘em all

  2. About me_
    Saâd Dif
    SRE
    [email protected]
    eng.kapten.com
    @Kapten_tech
    @kapten-engineering
    ChauffeurPrive

  3. Agenda_
    Kingdom of Kapten_
    Once upon a time_
    Uh oh_
    Nestor_
    They lived happily_

  4. Kingdom of Kapten_

  5. Kapten_History
    Passenger_ Driver_

  6. Kapten_History

  7. Kapten_Growth
    Revenue 2016 – 2017 – 2018 (bar chart): 48.6 (2016), 100 (2017), 160 (2018, +60%)
    3 million customers
    400 collaborators
    50,000 partner drivers
    3,000 client companies
    French ride-hailing leader

  8. Kapten_Production
    1TB storage
    60k metrics
    15k inc / min
    100TB storage
    780k logs / min
    50B lines of log
    100+ nodes
    130+ µ-services
    20+ deploys / day
    1k+ processes

  9. Agenda_
    Kingdom of Kapten_
    Once upon a time_
    Uh oh_
    Nestor_
    They lived happily_

  10. Heroku_
    ● Why?
    - PaaS over IaaS
    - Easy for developers to handle and scale
    - Ready to use
    - AutoBuild for various languages
    - Everything as a Service
    ● Configuration management?
    - Git repositories letsdeploy/letsdeploy-config
    - Hand-provisioned servers
    ● Deployments?
    - Letsdeploy

  11. Letsdeploy_
    ● Python scripts
    ● Release management
    ● Configuration management
    ● Scaling

  12. Letsdeploy-config_
    ● One file per micro service
    ● Three sections for each configuration file
    {
      "source": "[email protected]:transcovo/nestor-charybde.git",
      "target": {
        "heroku_app_name": "prod-nestor-charybde-cp-eu",
        "type": "heroku"
      },
      "variables": {
        "AWS_ACCESS_KEY_ID": "*****",
        "AWS_SECRET_ACCESS_KEY": "*****",
        "LOGGER_LEVEL": "info",
        "LOGGER_NAME": "production.nestor-charybde",
        "NODE_ENV": "production",
        "NPM_TOKEN": "*****",
        "SLACK_TOKEN": "*****"
      }
    }
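
The three sections of a per-service file (source, target, variables) lend themselves to a simple structural check before a deployment. The following is a minimal Python sketch of such a check, not Letsdeploy's actual code: the helper name `validate_service_config` and the example values are assumptions made for illustration.

```python
import json

# The three top-level sections shown in the slide's example file.
REQUIRED_SECTIONS = ("source", "target", "variables")


def validate_service_config(raw: str) -> dict:
    """Parse one per-service JSON file and check its three sections.

    Hypothetical helper: the real Letsdeploy validation is not shown in the talk.
    """
    config = json.loads(raw)
    missing = [name for name in REQUIRED_SECTIONS if name not in config]
    if missing:
        raise ValueError("missing sections: " + ", ".join(missing))
    if not isinstance(config["variables"], dict):
        raise ValueError("'variables' must map variable names to values")
    return config


# Illustrative content only; the repository URL and app name are made up.
example = """
{
  "source": "git@example.com:org/service.git",
  "target": {"heroku_app_name": "prod-service", "type": "heroku"},
  "variables": {"NODE_ENV": "production", "LOGGER_LEVEL": "info"}
}
"""
config = validate_service_config(example)
print(config["target"]["type"])
```

Failing early on a missing section keeps a malformed file from reaching the deployment step.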

  13. Nestor_
    Workflow management
    ● Master
    ● Greenlight
    ● Non-prod environments
    ● Production

  14. Nestor_
    Workflow management

  15. Agenda_
    Kingdom of Kapten_
    Once upon a time_
    Uh oh_
    Nestor_
    They lived happily_

  16. Get out of Heroku_
    ● Ending contract

  17. Get out of Heroku_
    ● Ending contract
    ● Move to Kubernetes

  18. Get out of Heroku_
    ● Ending contract
    ● Move to Kubernetes
    ○ Scalability
    ○ Costs
    ○ Real infra
    ○ Security

  19. What we need_
    ● Simple and easy deployment tool
    ○ Command wrapper
    ○ Same tool for every environment
    ○ End users: Our developers

  20. What we need_
    12 Factors (12Factor.net)
    ● Build, release, run
    ● Processes
    ● Port Binding
    ● Concurrency
    ● Disposability
    ● Dev/Prod parity
    ● Logs
    ● Admin processes
    ● Codebase
    ● Dependencies
    ● Config
    ● Backing services

  21. What we need_
    12Factor.net
    ● Our microservices should respect those principles
    ● The tool should help us do so

  22. What we need_
    ● Usable by everybody

  23. The clock is ticking_
    ● Time constraint
    ● Need to scale fast
    ● Ansible / Terraform

  24. Agenda_
    Kingdom of Kapten_
    Once upon a time_
    Uh oh_
    Nestor_
    They lived happily_

  25. Nestor_
    ● Rewrite existing tool
    ● Knowledge already present in all teams
    ● First use case: dev environments

  26. Nestor_
    ● Able to deploy on different platforms

  27. Nestor_
    Three key components:
    ● CLI
    ● API
    ● CRON

  28. Nestor_
    CLI:
    ● Workflow management
    ● Release related commands
    ● Configuration
    ● Datastore management

  29. Nestor_
    Wrapper around kubectl:
    ● Port forward
    ● Switch environments
    ● ...

  30. Nestor_
    API:
    ● NodeJS
    ● Manage workflow
    ● Triggered by CI to initiate builds
    ● Push images to Docker Hub

  31. Nestor_
    API:
    ● Dockerfile
    ● Procfile
    ● CronFile
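
A Procfile, in the Heroku convention, declares one process type per line as `<type>: <command>`. How Nestor's API reads these files is not shown in the talk; the Python sketch below only illustrates parsing that format, with a hypothetical function name and made-up commands.

```python
import re

# One declaration per line: "<process type>: <command>".
PROCFILE_LINE = re.compile(r"^([A-Za-z0-9_-]+):\s*(.+)$")


def parse_procfile(text: str) -> dict:
    """Return a mapping of process type to command (hypothetical helper)."""
    processes = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        match = PROCFILE_LINE.match(line)
        if match is None:
            raise ValueError(f"invalid Procfile line: {line!r}")
        processes[match.group(1)] = match.group(2)
    return processes


# Illustrative content; the commands are made up.
procfile = "web: node server.js\nworker: node worker.js\n"
processes = parse_procfile(procfile)
print(sorted(processes))
```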

  32. Nestor_
    API:
    ● Called by Rundeck
    ● One build per code delivery

  33. Nestor_
    CRON:
    ● All environments updated from the app versions running on “staging”
    ● Releases on staging every 30 minutes
    ● Datastore snapshot and reset of all dev environments
    ● App configuration synced from staging to dev environments

  34. Nestor_
    Configuration built dynamically:
    ● Kubernetes templates
    ● “project.yaml” file for global configuration
    ● File for each micro service
    ● Children environments
    ● Merge of files
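
The "merge of files" step can be pictured as a recursive dictionary merge in which more specific files override global defaults. The Python sketch below assumes that precedence (per-app values win over "project.yaml" values), which the slides do not state explicitly; `deep_merge` and the sample values are illustrative, not Nestor's code.

```python
import copy


def deep_merge(base: dict, override: dict) -> dict:
    """Recursively merge `override` into a copy of `base`.

    Nested dictionaries are merged key by key; any other value in
    `override` replaces the value from `base`.
    """
    merged = copy.deepcopy(base)
    for key, value in override.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            merged[key] = deep_merge(merged[key], value)
        else:
            merged[key] = copy.deepcopy(value)
    return merged


# Assumed layering: global defaults first, then the per-app file.
project_defaults = {
    "scales": {"web": {"minReplicas": 3, "maxReplicas": 3}},
    "variables": {"ope": {"NODE_ENV": "production"}},
}
app_overrides = {
    "scales": {"web": {"maxReplicas": 9}},
    "variables": {"ope": {"SENTRY_DSN": "********"}},
}
effective = deep_merge(project_defaults, app_overrides)
print(effective["scales"]["web"])
```

A children environment could be layered the same way, by merging its file on top of the result.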

  35. Nestor_
    Kubernetes templates:
    $ tree nestor-config/templates
    nestor-config/templates
    ├── anti-affinity-node.yaml
    ├── anti-affinity-zone.yaml
    ├── config-map.yaml
    ├── cronjob.yaml
    ├── deployment.yaml
    ├── hpa.yaml
    ├── ingress-app.yaml
    ├── ingress-global.yaml
    ├── ingress-nginx-global.yaml
    ├── job.yaml
    ├── namespace.yaml
    ├── nginx.conf
    ├── secret-tls.yaml
    └── service.yaml
    0 directories, 14 files
    $ cat nestor-config/templates/service.yaml
    apiVersion: v1
    kind: Service
    metadata:
      name: '{{name}}'
      labels:
        app: '{{app}}'
    spec:
      ports:
        - port: 80
          targetPort: {{target_port}}
      selector:
        app: '{{app}}'
        process: web
      type: ClusterIP
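
The `{{name}}`, `{{app}}` and `{{target_port}}` placeholders suggest a plain substitution step before the manifest is applied. Which template engine Nestor uses is not stated, so the Python sketch below stands in with a small regex-based renderer; `render` and the sample values are assumptions.

```python
import re

# Matches placeholders such as {{name}} or {{ target_port }}.
PLACEHOLDER = re.compile(r"\{\{\s*(\w+)\s*\}\}")


def render(template: str, values: dict) -> str:
    """Replace every {{key}} with its value; fail on unknown keys."""
    def substitute(match):
        key = match.group(1)
        if key not in values:
            raise KeyError(f"no value for placeholder {key!r}")
        return str(values[key])

    return PLACEHOLDER.sub(substitute, template)


# A fragment of the service template from the slide.
template = (
    "metadata:\n"
    "  name: '{{name}}'\n"
    "spec:\n"
    "  ports:\n"
    "    - port: 80\n"
    "      targetPort: {{target_port}}\n"
)
rendered = render(template, {"name": "mary-web", "target_port": 3000})
print(rendered)
```

Raising on a missing value is a deliberate choice here: a manifest with an unresolved placeholder should never reach the cluster.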

  36. Nestor_
    project.yaml file:
    $ cat nestor-config/project.yaml
    project: kapten
    env: production
    domain: production.kapten.com
    docker:
      build:
        variables:
          NPM_TOKEN: $NPM_TOKEN
      registries:
        docker.com:
          - id: docker
            organization: kapten
    ...
    ...
    deployments:
      kubernetes:
        - cluster_name: kapten_production_eu-w1
          hpa_replicas: true
    scales:
      web:
        minReplicas: 3
        maxReplicas: 3
    resources:
      web:
        limits:
          memory: 256Mi
          cpu: 0.2
    nodeSelector:
      default:
        tier: app
    ...
    ...
    variables:
      ope:
        NODE_ENV: 'production'
        METRICS_DESTINATION: metrics.kapten.com
    slack:
      token: '**********'
      channels:
        info: tech-release-prod
        error: tech-release-prod
    ...

  37. Nestor_
    App configuration file:
    app: mary
    git:
      origin: [email protected]:transcovo/mary.git
    is_enabled: true
    resources:
      web:
        limits:
          cpu: 0.2
          memory: 250Mi
        requests:
          cpu: 0.2
          memory: 250Mi
    scales:
      web:
        maxReplicas: 9
        minReplicas: 3
        targetCPUUtilizationPercentage: 75
    teams:
      - security
    variables:
      app: {}
      ope:
        SENTRY_DSN: "********"

  38. Nestor_
    Monitoring and alerting in configuration files:
    ● Routing
    ● Threshold
    Config validation
    templateVars:
      tplCriticity: high
      tplTeam: security
      tplWeb2xxThreshold: "0"
      tplWeb50thLatencyThreshold: "0.30"
      tplWeb95thLatencyThreshold: "2"

  39. Nestor_
    History management
    ● Nestor history
    ● History saved on specific repository
    ● Used for rollbacks

  40. Nestor_
    Rollbacks:
    ● Apply previous yaml
    ● Use a specific commit id
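
Both rollback modes amount to choosing an entry from the saved history and re-applying its YAML. The Python sketch below illustrates that selection only; the `Release` record, the oldest-first ordering and the function name are assumptions, since the layout of Nestor's history repository is not described.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Release:
    commit_id: str
    manifest: str  # the rendered YAML that was applied for this release


def pick_rollback_target(history: List[Release],
                         commit_id: Optional[str] = None) -> Release:
    """Choose the release to re-apply.

    `history` is assumed to be ordered oldest first, with the last entry
    being the release currently deployed.
    """
    if commit_id is not None:
        for release in history:
            if release.commit_id == commit_id:
                return release
        raise LookupError(f"commit {commit_id!r} not found in history")
    if len(history) < 2:
        raise LookupError("no previous release to roll back to")
    return history[-2]


# Made-up history entries for illustration.
history = [Release("a1", "v1"), Release("b2", "v2"), Release("c3", "v3")]
print(pick_rollback_target(history).commit_id)
print(pick_rollback_target(history, "a1").commit_id)
```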

  41. Nestor_
    Most used features by developers:
    ● Deploy specific branch
    ● Port forwarding
    ● Switch between environments
    ● Datastore management

  42. Workflow with Nestor_
    From code...
    Unit tests, e2e tests, Load tests
    Testing env., Monitoring
    To prod...
    ...in minutes with nestor_!

  43. Workflow_
    Environments: dev 1, dev 2, ..., dev X, Master, GreenLight, Terminator, Staging, Shadow, Production
    Peer Review
    CircleCI success (master branch), Nestor-api:
    ● Create a git tag
    ● Build Docker image
    ● Rebase Greenlight branch from Master
    GL success, Nestor-api:
    ● Rebase Terminator, Staging and Shadow from Greenlight
    Rundeck Deploy, Nestor-api:
    ● Rebase Production from Shadow
    Legend
    ● Rebase and move git tag
    ● Call nestor-api
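
The promotion chain in this workflow (Master to Greenlight, Greenlight to Terminator, Staging and Shadow, Shadow to Production) can be modeled as a table of events and the rebases each one triggers. The Python sketch below simulates branch heads moving through that chain; the event names and data structures are hypothetical and say nothing about how nestor-api is implemented.

```python
from typing import Dict, List, Tuple

# event -> list of (target branch, source branch) rebases, as read from the slide.
PROMOTIONS: Dict[str, List[Tuple[str, str]]] = {
    "circleci_success": [("greenlight", "master")],
    "greenlight_success": [
        ("terminator", "greenlight"),
        ("staging", "greenlight"),
        ("shadow", "greenlight"),
    ],
    "rundeck_deploy": [("production", "shadow")],
}


def apply_event(heads: Dict[str, str], event: str) -> Dict[str, str]:
    """Return new branch heads after the rebases triggered by `event`."""
    updated = dict(heads)
    for target, source in PROMOTIONS[event]:
        updated[target] = heads[source]
    return updated


# Commit "c2" has just been merged into master; everything else is on "c1".
heads = {"master": "c2", "greenlight": "c1", "terminator": "c1",
         "staging": "c1", "shadow": "c1", "production": "c1"}
after_ci = apply_event(heads, "circleci_success")
for event in ("circleci_success", "greenlight_success", "rundeck_deploy"):
    heads = apply_event(heads, event)
print(heads["production"])
```

Each event moves a commit exactly one stage forward, so production only sees code that has already passed through Greenlight and Shadow.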

  44. Agenda_
    Kingdom of Kapten_
    Once upon a time_
    Uh oh_
    Nestor_
    They lived happily_

  45. To sum up_
    ● From PaaS to IaaS
    ● From deployment workflow to release and config management
    ● Migration took 4 months

  46. Thoughts_
    ● We needed Nestor
    ● There is no shame in building your own tool or wrapper
    ● Answered a specific need at a specific time

  47. Next steps_
    ● Container build
    ● CLI Rewrite
    ● ...

  48. Next steps_
    Helm?
    ● Community based
    ● No need to rewrite what is already here
    ● New V3
    ● Stable and reliable

  49. Next steps_
    To study:
    ● Debug pod live (Telepresence / Monday)
    ● Make it available to everyone?

  50. Thank you_!
    eng.kapten.com
    @Kapten_tech
    @kapten-engineering
    ChauffeurPrive
    15€ of credit with the promo code SYSADMIN
