Building Elastic Infrastructures

Auto-scaling infrastructure is still in its infancy, and information, open source tools and best practices for automatically scaling infrastructure and applications in the cloud are scarce.

Flipkart.com has a private cloud spread over two datacenters, with thousands of servers, many environments and 350+ developers. Project-based environments are created and destroyed automatically on demand, using infrastructure services (mail, DNS, monitoring, load balancing, backup) and application services such as MySQL, RabbitMQ and Couchbase. Software is developed and deployed using continuous integration.

How do we configure DNS automatically? How do we create MySQL clusters with master and slaves on demand? How do we make sure that all virtual machines come with monitoring and alerting by default? How are log files purged and backed up on their own? How do we make sure that your weekend hack is ready to perform at web scale when the whole world discovers it at the same time? All of these are important questions where automatic scaling and elastic infrastructure are concerned. I will answer all of the above.

My aim is to describe the current best practices and software to use when creating an auto-scaling infrastructure. I showcase an elastic infrastructure based on proven methods and open source software (Flipkart HostDB and Puppet), which lets us build a platform that in turn allows application engineers to create massively scalable web apps without losing sleep.

The lessons we’ve learned demonstrate that a combination of state management and configuration management systems is effective in building and maintaining an infrastructure and applications that are elastically scalable. The ideas discussed here can be applied to a private cloud or a public cloud like AWS.

Pankaj Kaushal

November 11, 2013

Transcript

  1. Flipkart.com • Millions of users. ~4000 req/sec • Rapid continuous change • One-click build and deployment (app + infra) • 50-100 releases a day. Production push is a non-event • 350+ devs collaborating
  2. Agenda • Why Build Automated Infrastructure? • How to scale Infrastructure and Applications alike? • Announcing: Something New • Deep Customization vs Generalized Tools
  3. History • One ops guy • Cushioned living vs Startup • Constantly struggle with hardware allocation • Disastrous re-clone of existing hardware • Forget monitoring ... regret later
  4. Challenges • One click VM • How do we configure DNS automatically? • Automated • MySQL Clusters • monitoring and alerting • log aggregation and purge • scaling of applications
  5. Early Learnings/wins • Standardize hardware • Remote Install + management • Virtualization • Package all software • Manage configurations in Puppet • Centralized host database
  6. Some gems • Site ops pulls the wrong wire out • Noob Ops guy reboots the user database • Devs allocate hardware which is not used for a year
  7. [Architecture diagram. Components: VM1 and VM2 on a physical box, each VM running an agent; HostDB API and HostDB (node info, node classes); agents for the Monitoring, Backup and DNS services; RabbitMQ, Unbound, MySQL, CouchBase; Kloud and the Kloud agent; Puppet server. Arrow labels: create host entries, create host, get puppet info, get host info, write, read.]
  8. A Host Database • Details all existing hosts • Stores information about a host • Ability to tag hosts according to purpose • Single source of truth
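A host database of this kind can be pictured as a set of host records that carry free-form tags. The following is a minimal sketch of that idea, not HostDB's actual data model or API; the `Host` and `HostRegistry` names, the fields and the sample addresses are all assumptions made for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Host:
    """One record in the host database."""
    name: str
    ip: str
    tags: set = field(default_factory=set)


class HostRegistry:
    """In-memory stand-in for a host database (hypothetical, not HostDB's API)."""

    def __init__(self):
        self._hosts = {}

    def add(self, host):
        self._hosts[host.name] = host

    def tag(self, name, tag):
        self._hosts[name].tags.add(tag)

    def with_tag(self, tag):
        """Return the sorted names of all hosts carrying `tag`."""
        return sorted(h.name for h in self._hosts.values() if tag in h.tags)


registry = HostRegistry()
registry.add(Host("web1", "10.0.0.11", {"web"}))
registry.add(Host("web2", "10.0.0.12", {"web"}))
registry.add(Host("db1", "10.0.0.21", {"mysql"}))
registry.tag("db1", "backup-daily")

print(registry.with_tag("web"))           # ['web1', 'web2']
print(registry.with_tag("backup-daily"))  # ['db1']
```

Because every tool asks the same registry "which hosts carry this tag?", the tag becomes the only place where a host's purpose is recorded.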
  9. Some gems • Site ops, dev, network, all on the same page; everyone uses hostname to communicate • Automated tools find hosts that are idle
  10. Another Simple idea • Keep the truth in HostDB • API for applications to interact • Single source of truth • De-centralized agents take actions
  11. Decentralized Agents. Agent to: • choose a physical box • create a VM on a physical host • update Puppet • add DNS records • add automated monitoring • <custom agent>
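One way to picture a decentralized agent is as an independent sweep over the host records that acts only on hosts it has not handled yet, so that running it again is harmless. The sketch below assumes a plain dictionary in place of the host database; `run_agent`, the `done` marker and the DNS stand-in are hypothetical names, not part of HostDB.

```python
def run_agent(agent_name, action, hosts):
    """Apply `action` to every host this agent has not handled yet.

    `hosts` maps hostname -> record; each record keeps a `done` set naming
    the agents that already processed it, so repeated sweeps are harmless.
    """
    handled = []
    for name in sorted(hosts):
        record = hosts[name]
        if agent_name in record["done"]:
            continue
        action(name, record)
        record["done"].add(agent_name)
        handled.append(name)
    return handled


dns_records = {}


def add_dns_record(name, record):
    # Stand-in for a real DNS update: just remember the name -> IP mapping.
    dns_records[name] = record["ip"]


hosts = {
    "vm1": {"ip": "10.0.0.11", "done": set()},
    "vm2": {"ip": "10.0.0.12", "done": set()},
}

first = run_agent("dns", add_dns_record, hosts)
second = run_agent("dns", add_dns_record, hosts)
print(first, second)  # ['vm1', 'vm2'] []
```

Each agent needs to know only its own task and the shared records, which is what allows new agents to be added without touching the existing ones.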
  12. [Architecture diagram, repeated from slide 7. Components: VM1 and VM2 on a physical box, each VM running an agent; HostDB API and HostDB (node info, node classes); agents for the Monitoring, Backup and DNS services; RabbitMQ, Unbound, MySQL, CouchBase; Kloud and the Kloud agent; Puppet server. Arrow labels: create host entries, create host, get puppet info, get host info, write, read.]
  13. Lifecycle • Single source of truth • Define the lifecycle of a host in HostDB • Write agents to solve just about any problem
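A host lifecycle can be modelled as a small state machine whose current state is stored with the host record. The slides do not list the actual states, so the state names and the `advance` helper below are purely illustrative assumptions.

```python
# Hypothetical lifecycle: the state names are illustrative only.
TRANSITIONS = {
    "requested": {"provisioned"},
    "provisioned": {"configured"},
    "configured": {"in_service"},
    "in_service": {"retired"},
    "retired": set(),
}


def advance(state, target):
    """Return `target` if the move is allowed, otherwise raise ValueError."""
    if target not in TRANSITIONS.get(state, set()):
        raise ValueError(f"illegal transition: {state} -> {target}")
    return target


state = "requested"
for step in ("provisioned", "configured", "in_service"):
    state = advance(state, step)
print(state)  # in_service
```

An agent would then watch for hosts in the state it cares about and move them to the next one when its work is done.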
  14. HostDB • Highly Available and Reliable • Namespaces and Access controls • Rollbacks/Git as backend • REST API for applications to interact
  15. HostDB • Web Application • REST API • Command Line Interface • HostDB::Client Perl Module
  16. Puppet • Manage Configurations • Create Different Environments • base environment as idempotent • app environment for infra/apps
  17. HostDB as ENC • Use an External Node Classifier (ENC) • Information from HostDB is imported into Puppet • ENC config: • node_terminus = exec • external_nodes = /usr/local/bin/foo
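With `node_terminus = exec`, Puppet runs the program named in `external_nodes` with the node name as its only argument and reads a YAML document from its standard output. The following is a minimal sketch of such a program, assuming a hard-coded `HOSTS` dictionary in place of a real HostDB lookup and assuming that tags map directly to Puppet class names; both are illustrative choices, not how HostDB does it.

```python
import sys

# Hypothetical data standing in for a HostDB lookup.
HOSTS = {
    "web1.example.com": {"environment": "production", "tags": ["base", "nginx"]},
}


def enc_yaml(hostname, hosts=HOSTS):
    """Return the ENC YAML document for `hostname`, or None if unknown."""
    record = hosts.get(hostname)
    if record is None:
        return None
    lines = ["---", "classes:"]
    lines += [f"  - {tag}" for tag in record["tags"]]
    lines.append(f"environment: {record['environment']}")
    return "\n".join(lines) + "\n"


def main(argv):
    """Entry point: argv[1] is the node name Puppet passes in."""
    if len(argv) != 2:
        print("usage: enc <node-name>", file=sys.stderr)
        return 2
    output = enc_yaml(argv[1])
    if output is None:
        return 1  # a non-zero exit makes Puppet fail classification for the node
    sys.stdout.write(output)
    return 0


print(enc_yaml("web1.example.com"), end="")
```

To install this as the ENC, the script would end with `sys.exit(main(sys.argv))` and be made executable at the path given in `external_nodes`.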
  18. VM • VM is created with a HostDB entry • Cloud Agent creates a host • Puppet ENC creates a node entry • Startup tasks on the host run Puppet • Puppet run configures the hosts
  19. DNS • All members of a zone are tagged • DNS agent creates a zone file periodically
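A DNS agent along these lines could select the hosts tagged for a zone and render them into a standard master-format zone file. The sketch below assumes a `zone:<name>` tag format, a name server called `ns1` and fixed SOA timers; none of these come from the slides.

```python
def zone_members(hosts, zone):
    """Select hosts tagged as members of `zone` (tag format is an assumption)."""
    wanted = f"zone:{zone}"
    return {h["name"]: h["ip"] for h in hosts if wanted in h["tags"]}


def render_zone(zone, members, serial, ttl=300):
    """Render a minimal master-format zone file with one A record per member."""
    lines = [
        f"$ORIGIN {zone}.",
        f"$TTL {ttl}",
        f"@ IN SOA ns1.{zone}. hostmaster.{zone}. ({serial} 3600 600 604800 {ttl})",
        f"@ IN NS ns1.{zone}.",
    ]
    for name in sorted(members):
        lines.append(f"{name} IN A {members[name]}")
    return "\n".join(lines) + "\n"


hosts = [
    {"name": "ns1", "ip": "10.0.0.2", "tags": {"zone:example.internal"}},
    {"name": "web1", "ip": "10.0.0.11", "tags": {"zone:example.internal"}},
    {"name": "db1", "ip": "10.0.1.21", "tags": {"zone:other.internal"}},
]

members = zone_members(hosts, "example.internal")
zone_text = render_zone("example.internal", members, serial=2013111101)
print(zone_text, end="")
```

Running the agent periodically with an increasing serial number regenerates the whole file from the tags, so a host that loses its tag simply disappears from the zone.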
  20. Backups • All hosts are tagged with their own specific backup tag • Backup agent looks at the tag and decides action • Puppet adds backup packages to the host
  21. Monitoring • Monitoring tags have details • Agents create config files for Nagios and other monitoring tools • Puppet knows what monitoring packages to install for a node
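Generating Nagios configuration from tags amounts to turning each host record into `define host` and `define service` object definitions. This sketch assumes that the stock `generic-host` and `generic-service` templates are available and that a monitoring tag maps to a check command through the hypothetical `CHECKS` table.

```python
# Assumed mapping from a monitoring tag to a Nagios check command.
CHECKS = {"http": "check_http", "ssh": "check_ssh"}


def nagios_config(name, address, monitoring_tags):
    """Build Nagios object definitions for one host and its tagged checks."""
    blocks = [
        "define host {\n"
        "    use        generic-host\n"
        f"    host_name  {name}\n"
        f"    address    {address}\n"
        "}\n"
    ]
    for tag in sorted(monitoring_tags):
        if tag not in CHECKS:
            continue  # unknown tags are ignored rather than guessed at
        blocks.append(
            "define service {\n"
            "    use                  generic-service\n"
            f"    host_name            {name}\n"
            f"    service_description  {tag}\n"
            f"    check_command        {CHECKS[tag]}\n"
            "}\n"
        )
    return "\n".join(blocks)


cfg = nagios_config("web1", "10.0.0.11", {"http", "ssh", "unknown"})
print(cfg, end="")
```

The agent would write one such file per host and reload Nagios, while Puppet installs the matching monitoring packages on the node.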
  22. Elastic DNS • Hosts point to a VIP • Define minimum and maximum VM thresholds for the DNS hostgroup in Kloud • Unbound hosts created on demand • LoadBalancer Agent reads HostDB tags and adds machines to LB on demand
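The minimum and maximum thresholds bound how far a hostgroup may shrink or grow. A minimal sketch of the sizing decision follows, assuming load is measured in requests per second and each VM has a fixed capacity; the function name and the metric are assumptions, not Kloud's actual policy.

```python
import math


def desired_vm_count(total_load, capacity_per_vm, minimum, maximum):
    """Size a hostgroup for `total_load`, clamped to [minimum, maximum]."""
    if capacity_per_vm <= 0:
        raise ValueError("capacity_per_vm must be positive")
    if minimum > maximum:
        raise ValueError("minimum must not exceed maximum")
    needed = math.ceil(total_load / capacity_per_vm)
    return max(minimum, min(maximum, needed))


print(desired_vm_count(900, 200, minimum=2, maximum=6))   # 5
print(desired_vm_count(100, 200, minimum=2, maximum=6))   # 2
print(desired_vm_count(5000, 200, minimum=2, maximum=6))  # 6
```

The lower bound keeps the service redundant when traffic is low, and the upper bound stops a traffic spike or a faulty metric from consuming the whole cloud.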
  23. Elastic Databases • Hosts point to a VIP • MySQL HA monitors Databases • Failover on Failure • New DB Hosts • Replication
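On failover, something has to decide which replica the VIP should point to next. The slides do not say how that choice is made; the sketch below assumes one plausible policy, promoting the healthy replica with the least replication lag, and the record fields are hypothetical.

```python
def pick_new_master(replicas):
    """Return the name of the healthy replica with the least lag, or None."""
    healthy = [r for r in replicas if r["healthy"]]
    if not healthy:
        return None
    # Ties on lag are broken by name so the choice is deterministic.
    best = min(healthy, key=lambda r: (r["lag_seconds"], r["name"]))
    return best["name"]


replicas = [
    {"name": "db2", "healthy": True, "lag_seconds": 4},
    {"name": "db3", "healthy": True, "lag_seconds": 1},
    {"name": "db4", "healthy": False, "lag_seconds": 0},
]

print(pick_new_master(replicas))  # db3
```

Once a replica is chosen, moving the VIP to it keeps clients unaware of the change, since they only ever address the VIP.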
  24. Announce: HostDB • Released under the Apache license • github.com/Flipkart/HostDB • We’re committed to maintaining it • Works as a great ENC for Puppet
  25. Conviction: Deep Customization • Not preaching a way of doing things • Pre mco/facter/foreman