
Building Elastic Infrastructures


Auto scaling infrastructure is still in its infancy, and information, open source tools and best practices for automatically scaling infrastructure and applications in the cloud are scarce.

Flipkart.com has a private cloud spread over two datacenters, with thousands of servers, many environments and 350+ developers. Project-based environments are created and destroyed automatically on demand, using infrastructure services such as mail, DNS, monitoring, load balancing and backup, and application services like MySQL, RabbitMQ and Couchbase. Software is developed and deployed using continuous integration.

How do we configure DNS automatically? How do we create MySQL clusters with master and slaves on demand? How do we make sure that all virtual machines come with monitoring and alerting by default? How are log files purged and backed up on their own? How do we make sure that your weekend hack is ready to perform at web scale when the whole world discovers it at the same time? These are all important questions where automatically scaling, elastic infrastructure is concerned. I will answer all of the above.

My aim is to describe the current best practices and software to use when creating an auto scaling infrastructure. I showcase an elastic infrastructure based on proven methods and open source software (Flipkart HostDB and Puppet), which enables us to build a platform that in turn allows application engineers to create massively scalable web apps without losing sleep.

The lessons we’ve learned demonstrate that a combination of state management and configuration management systems is effective in building and maintaining an infrastructure and applications that are elastically scalable. The ideas discussed here can be applied to a private cloud or a public cloud like AWS.

Pankaj Kaushal

November 11, 2013


Transcript

  1. Elastic Infrastructures
    Pankaj Kaushal

    flipkart.com


  2. whoami
    • Datacenter Architect at Flipkart
    • #accident-prone #Photography enthusiast
    • @spo0nman on the interwebs

  3. Flipkart.com
    • Millions of users. ~4000 req/sec
    • Rapid continuous change
    • One click build and deployment (app + infra)
    • 50-100 releases a day. Production push is a non-event
    • 350+ devs collaborating

  4. Agenda
    • Why Build Automated Infrastructure?
    • How to scale Infrastructure and Applications alike?
    • Announcing: Something New
    • Deep Customization vs Generalized Tools

  5. The Problem


  6. History
    • One ops guy
    • Cushioned living vs Startup
    • Constantly struggle with hardware allocation
    • Disastrous re-clone of existing hardware
    • Forget monitoring ... regret later

  7. Challenges
    • One click VM
    • How do we configure DNS automatically?
    • Automated:
      • MySQL clusters
      • monitoring and alerting
      • log aggregation and purge
      • scaling of applications

  8. Early Learnings/wins
    • Standardize hardware
    • Remote Install + management
    • Virtualization
    • Package all software
    • Manage configurations in Puppet
    • Centralized host database

  9. Some gems
    • Site ops pulls the wrong wire out
    • Noob Ops guy reboots the user database
    • Devs allocate hardware which is not used for a year

  10. The Solution


  11. [Diagram labels: physical box (VM1, VM2); HostDB and HostDB API (read, write); Kloud and Kloud agent (create host, create host entries); Puppet server (node info, node classes, get puppet info, get host info); agents for the monitoring, backup and DNS services; RabbitMQ, Unbound, MySQL, CouchBase]

  12. A Host Database
    • Details all existing hosts
    • Store information about a host
    • Ability to tag hosts according to purpose
    • Single source of truth

  13. single source of truth


  14. a virtual host


  15. Some gems
    • Site ops, dev, network, all on the same page, everyone uses hostname to communicate
    • Automated tools find hosts that are idle

  16. Another Simple idea
    • Keep the truth in HostDB
    • API for applications to interact
    • Single source of truth
    • Decentralized agents take actions

  17. Decentralized Agents
    Agent to:
    • choose a physical box
    • create a VM on a physical host
    • update Puppet
    • add DNS records
    • add automated monitoring

  18. [Diagram: same labels as slide 11]

  19. Simple yet powerful


  20. lifecycle
    • Single source of truth
    • Define the lifecycle of a host in HostDB
    • Write agents to solve just about any problem
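
The lifecycle idea can be pictured as a small state machine stored alongside each host record. The sketch below is only an illustration: the state names, the record layout and both helper functions are assumptions made for this example, since the slides do not say how HostDB models lifecycle states.

```python
from enum import Enum


class HostState(Enum):
    # Hypothetical lifecycle states; the slides do not name the real ones.
    REQUESTED = "requested"
    CREATED = "created"
    CONFIGURED = "configured"
    IN_SERVICE = "in_service"
    RETIRED = "retired"


# Allowed transitions: each state maps to the states it may move to next.
TRANSITIONS = {
    HostState.REQUESTED: {HostState.CREATED},
    HostState.CREATED: {HostState.CONFIGURED, HostState.RETIRED},
    HostState.CONFIGURED: {HostState.IN_SERVICE, HostState.RETIRED},
    HostState.IN_SERVICE: {HostState.RETIRED},
    HostState.RETIRED: set(),
}


def advance(host: dict, new_state: HostState) -> dict:
    """Return a copy of the host record moved to new_state, or raise ValueError."""
    current = HostState(host["state"])
    if new_state not in TRANSITIONS[current]:
        raise ValueError(
            f"{host['name']}: {current.value} -> {new_state.value} is not allowed"
        )
    return {**host, "state": new_state.value}


def hosts_in_state(hosts: list, state: HostState) -> list:
    """The question an agent would ask the host database: which hosts are mine to act on?"""
    return [h for h in hosts if h["state"] == state.value]
```

With a model like this, each agent only needs to poll for hosts in the one state it cares about and advance them when its work is done, which keeps agents independent of each other.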

  21. The Implementation


  22. Puppet + HostDB


  23. HostDB
    • Highly Available and Reliable
    • Namespaces and Access controls
    • Rollbacks/Git as backend
    • REST API for applications to interact

  24. HostDB
    • Web Application
    • REST API
    • Command Line Interface
    • HostDB::Client Perl Module

  25. Puppet
    • Manage Configurations
    • Create Different Environments
    • base environment as idempotent
    • app environment for infra/apps

  26. HostDB as ENC
    • Use an External Node Classifier (ENC)
    • Information from HostDB is imported into Puppet
    • ENC config:
      • node_terminus = exec
      • external_nodes = /usr/local/bin/foo
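
To make the ENC contract concrete: Puppet runs the program named in external_nodes with the node's certname as its only argument and reads a YAML document with classes, parameters and environment from standard output. The sketch below follows that contract but is not the HostDB implementation. The HOSTS dictionary stands in for a HostDB lookup, and its field names, the host name and the class names are all assumptions made for this example.

```python
import sys

# Stand-in for the host database. In a real setup this lookup would go
# through the HostDB API; these records and field names are hypothetical.
HOSTS = {
    "web1.example.com": {
        "environment": "production",
        "classes": ["base", "nginx"],
        "parameters": {"role": "frontend"},
    },
}


def render_enc_yaml(record: dict) -> str:
    """Render a host record as the YAML document Puppet expects from an ENC.

    Handles only the simple shape used here: a list of class names and a
    flat map of string parameters.
    """
    lines = ["---", "classes:"]
    lines += [f"  - {name}" for name in record["classes"]]
    lines.append("parameters:")
    lines += [f'  {key}: "{value}"' for key, value in sorted(record["parameters"].items())]
    lines.append(f"environment: {record['environment']}")
    return "\n".join(lines) + "\n"


def main(argv: list) -> int:
    """Classify the node named in argv[1]; return the process exit status."""
    # Puppet calls the ENC with the node's certname as the only argument.
    if len(argv) != 2 or argv[1] not in HOSTS:
        return 1  # non-zero exit tells Puppet the node could not be classified
    sys.stdout.write(render_enc_yaml(HOSTS[argv[1]]))
    return 0
```

Saved as an executable at the path given in external_nodes, the script would finish with `sys.exit(main(sys.argv))` so that Puppet sees the exit status.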

  27. VM
    • VM is created with a HostDB entry
    • Cloud Agent creates a host
    • Puppet ENC creates a node entry
    • Startup tasks on the host run Puppet
    • Puppet run configures the hosts

  28. DNS
    • All members of a zone are tagged
    • DNS agent creates a zone file periodically
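
A DNS agent of this kind boils down to "select the hosts tagged for a zone, then render records for them". The sketch below assumes a `dns_zone` tag, a simple host record layout and a BIND-style zone file with a name server called ns1; none of these details come from the slides.

```python
def render_zone(zone: str, serial: int, hosts: list) -> str:
    """Build a minimal BIND-style zone file from the hosts tagged with `zone`."""
    lines = [
        f"$ORIGIN {zone}.",
        "$TTL 300",
        # SOA fields: serial, refresh, retry, expire, negative-cache TTL.
        f"@ IN SOA ns1.{zone}. hostmaster.{zone}. ({serial} 3600 600 86400 300)",
        f"@ IN NS ns1.{zone}.",
    ]
    # Only hosts carrying the (hypothetical) dns_zone tag for this zone are included.
    members = sorted(
        (h for h in hosts if h["tags"].get("dns_zone") == zone),
        key=lambda h: h["name"],
    )
    for host in members:
        lines.append(f"{host['name']} IN A {host['ip']}")
    return "\n".join(lines) + "\n"


# Example host records as they might come back from a host database query.
EXAMPLE_HOSTS = [
    {"name": "web1", "ip": "10.0.0.11", "tags": {"dns_zone": "example.internal"}},
    {"name": "db1", "ip": "10.0.0.21", "tags": {"dns_zone": "other.internal"}},
    {"name": "ns1", "ip": "10.0.0.2", "tags": {"dns_zone": "example.internal"}},
]
```

Run periodically with a fresh serial, a function like this regenerates the whole file from the current tags, so a host that loses its tag simply drops out of the next zone file.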

  29. a virtual host


  30. Backups
    • All hosts are tagged with their own specific backup tag
    • Backup agent looks at the tag and decides action
    • Puppet adds backup packages to the host
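
The "agent looks at the tag and decides action" step might look like the following. The tag values, the policy table and the record layout are invented for illustration; the slides do not list the real backup tags.

```python
# Hypothetical backup tags and the action each one maps to.
BACKUP_POLICIES = {
    "mysql-daily": {"method": "mysqldump", "every_hours": 24},
    "files-weekly": {"method": "rsync", "every_hours": 168},
}


def plan_backups(hosts: list, hours_since_last: dict) -> list:
    """Return (host name, method) pairs for every host whose backup is due."""
    due = []
    for host in hosts:
        policy = BACKUP_POLICIES.get(host["tags"].get("backup"))
        if policy is None:
            continue  # untagged or unknown tag: this agent leaves the host alone
        elapsed = hours_since_last.get(host["name"])
        # A host that has never been backed up is always due.
        if elapsed is None or elapsed >= policy["every_hours"]:
            due.append((host["name"], policy["method"]))
    return due
```

Keeping the policy in a table keyed by tag means that changing a host's backup behaviour is just a tag change in the host database, with no edit to the agent.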

  31. backup


  32. Monitoring
    • Monitoring tags have details
    • Agents create config files for Nagios and other monitoring tools
    • Puppet knows what monitoring packages to install for a node
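
As a rough picture of what a monitoring agent could generate, the sketch below turns a host record into Nagios object definitions. The `monitoring` tag layout is an assumption, and the `generic-host` and `generic-service` templates and the check command names are assumed to be defined elsewhere in the Nagios configuration.

```python
def render_nagios_config(host: dict) -> str:
    """Render one host definition plus one service per monitoring tag entry."""
    blocks = [
        "define host {\n"
        "    use        generic-host\n"
        f"    host_name  {host['name']}\n"
        f"    address    {host['ip']}\n"
        "}"
    ]
    # Each entry in the hypothetical monitoring tag maps a description to a check command.
    for description, command in sorted(host["tags"].get("monitoring", {}).items()):
        blocks.append(
            "define service {\n"
            "    use                  generic-service\n"
            f"    host_name            {host['name']}\n"
            f"    service_description  {description}\n"
            f"    check_command        {command}\n"
            "}"
        )
    return "\n\n".join(blocks) + "\n"


EXAMPLE_HOST = {
    "name": "web1",
    "ip": "10.0.0.11",
    "tags": {"monitoring": {"HTTP": "check_http", "SSH": "check_ssh"}},
}
```

Because the output is derived entirely from the host record, a newly created VM gets monitoring as soon as it appears in the host database, which is the "monitoring by default" property the talk is after.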

  33. a virtual host


  34. Scaling Apps
    • Application-specific keys in HostDB
    • Puppet classes define tasks

  35. Autoscale Apps
    Warehouse and Shipping


  36. Autoscale Unbound


  37. app specific definition

  38. Elastic DNS
    • Hosts point to a VIP
    • Define minimum and maximum VM thresholds for the DNS hostgroup in Kloud
    • Unbound hosts created on demand
    • LoadBalancer Agent reads HostDB tags and adds machines to LB on demand
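
The minimum and maximum thresholds amount to clamping a demand-based host count to a fixed range. The sketch below shows that calculation under assumed inputs: the queries-per-second metric, the per-host capacity figure and both function names are hypothetical, as the slides do not say what signal Kloud scales on.

```python
import math


def desired_hosts(total_qps: float, qps_per_host: float,
                  min_hosts: int, max_hosts: int) -> int:
    """Hosts needed for the current load, clamped to the hostgroup thresholds."""
    if qps_per_host <= 0:
        raise ValueError("qps_per_host must be positive")
    if min_hosts > max_hosts:
        raise ValueError("min_hosts must not exceed max_hosts")
    needed = math.ceil(total_qps / qps_per_host)
    return max(min_hosts, min(max_hosts, needed))


def scaling_action(current: int, desired: int) -> tuple:
    """Translate the gap between current and desired into an action and a count."""
    if desired > current:
        return ("create", desired - current)
    if desired < current:
        return ("destroy", current - desired)
    return ("none", 0)
```

For example, 4500 queries per second at an assumed 1000 per host, with thresholds of 2 and 8, gives a desired count of 5; with 3 hosts running the action is to create 2 more.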

  39. Elastic Databases
    • Hosts point to a VIP
    • MySQL HA monitors Databases
    • Failover on Failure
    • New DB Hosts
    • Replication

  40. Announce: HostDB
    • Released under the Apache license
    • github.com/Flipkart/HostDB
    • We’re committed to maintaining it
    • Works as a great ENC for Puppet

  41. O
    /|\
    • Conviction: Deep Customization
    • Not preaching a way of doing things
    • Pre mco/facter/foreman

  42. Thank you
    Pankaj Kaushal - Flipkart.com
    @spo0nman @flipkart_tech