Automation Principles at Helpshift

Everyone knows that automation in operations is necessary and is a key objective of any DevOps practice. But if it is not started on the right foot, or with the right objective in mind, it can often lead to false starts and bad design choices. Building on it and extending it to incorporate new requirements then gets increasingly difficult, leading to hacks and rewrites.

At Helpshift we follow certain principles and guidelines that have helped us avoid these pitfalls and have allowed us to build a stable and extensible automation framework using Ansible.

I will walk through these principles, the thought process behind them, and the results through demos.


Raghu Udiyar

November 23, 2017


  1. Automation Principles at Helpshift. Raghu Udiyar, Production Engineering Manager @

  2. 75k+ RPS, 600M+ MAU, 5+ releases / day, 500 GiB / day
  3. Ⅰ - Workflows
    High-level objective with all its logic in one place
    End-to-end automation, avoiding piecemeal solutions
  4. E.g. Provision Instance Workflow
    $ ansible-playbook provision.yml -e "cli_cloud=aws cli_cluster_name=mongocl01 cli_count=6 cli_instance_type=r3.xlarge...
    A single interface encapsulates all the logic of provisioning, such as instance configuration, monitoring, OS configuration, etc.
  5. E.g. Release Software
    $ release.sh <service_name> <service_tag> <hosts>...
    Release a service while taking care of dependencies, service configuration, health checks, etc.
  6. E.g. Setup Kafka Cluster
    $ ansible-playbook kafka_cluster.yml -e 'cli_cluster_name=kafkacl01 cli_count=6 cli_zookper=cl_zk…
    • Provisioning
    • Configuration
    • Monitoring
    • Backups
  7. Demo: Redis Cluster Workflow

  8. Ⅱ - Explicit Control Flow
    • Push vs Pull Model: we follow the push model
      ◦ E.g.: AMI or agent-based configuration
    • Avoid Inversion of Control
    • Simple, easy to reason about and maintain
    • Better error handling
  9. Ⅲ - Idempotent Workflows
    Idempotency at all levels, i.e. at the workflow level
    Easy to reason about, with well-defined behaviour
  10. Idempotency with Redis cluster
    ▪ If the cluster already exists and is configured, a re-run causes no side effects
    ▪ Changing to cli_slave_count=2 results in unsurprising, well-defined behaviour
  11. Ⅳ - Think like a Programmer
    • Implement consistent interfaces for workflows
    • Abstract common tasks together (e.g. provisioning)
    • Encapsulate all logic inside the respective workflows
  12. Redis Cluster abstractions
    • redis_cluster.yml
      ◦ provision.yml
        ▪ Provision instance, users, monitoring, disks, etc.
      ◦ redis role
        ▪ Configuration, redis_type (master / slave)
  13. Ansible + Workflows Infrastructure, Orchestration and Configuration management

  14. Ansible
    Self Service via Jenkins
      • Dev test machines
      • Sandboxes
      • Prod releases
    Systems Automation
      • Autoscaling
      • OS configuration
      • Security
    Backend Services
      PostgreSQL, Kafka, MongoDB, Nginx, HAProxy, Redis, etc.
  15. In Summary
    1. Automate with workflows in mind
    2. Explicit Control Flow
    3. Idempotency all the way
    4. Think like a programmer
  16. Thank You
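
The idempotency described on slides 9 and 10 can be illustrated with a minimal Python sketch. This is not Helpshift's Ansible code: `ClusterState`, `ensure_redis_cluster`, and the `master_count` parameter are hypothetical names, and an in-memory object stands in for real infrastructure. Only the idea comes from the slides: a re-run with the same inputs does nothing, and changing the slave count produces one well-defined change.

```python
from dataclasses import dataclass, field


@dataclass
class ClusterState:
    """Hypothetical in-memory stand-in for the real Redis cluster."""
    masters: int = 0
    slaves_per_master: int = 0
    actions: list = field(default_factory=list)  # log of side effects performed


def ensure_redis_cluster(state, master_count, slave_count):
    """Converge `state` towards the desired shape.

    Returns True if anything had to change, False if the cluster already
    matched. Scaling masters down is deliberately out of scope here.
    """
    changed = False

    # Provision only the masters that are missing.
    if state.masters < master_count:
        missing = master_count - state.masters
        state.actions.append(f"provision {missing} master(s)")
        state.masters = master_count
        changed = True

    # Adjust replication only when the desired count differs.
    if state.slaves_per_master != slave_count:
        state.actions.append(
            f"set slaves per master: {state.slaves_per_master} -> {slave_count}"
        )
        state.slaves_per_master = slave_count
        changed = True

    return changed


state = ClusterState()

# First run creates the cluster.
first_run = ensure_redis_cluster(state, master_count=3, slave_count=1)

# Re-running with identical inputs performs no side effects.
second_run = ensure_redis_cluster(state, master_count=3, slave_count=1)
actions_after_rerun = len(state.actions)

# Changing only the slave count triggers exactly one additional action.
third_run = ensure_redis_cluster(state, master_count=3, slave_count=2)
```

The workflow function compares desired state with current state before acting, so the caller can invoke it repeatedly without tracking what has already been done.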