Our journey to dynamic cloud infrastructure using packer + terraform + ansible + jenkins - DevConf.in 2017

Mehul Ved

May 07, 2017

Transcript

  1. Who are we?
     • Nexsales is a B2B Marketing and Sales Solution provider
     • VoiceReach – 2013 – First product
     • Rightleads – 2016
     • VoiceReach API Platform – 2017
  2. The Beginning - Scenario
     • Static Infrastructure on Rackspace Cloud
     • 2 machines – Development Environment only
     • Internal Users, Low Volume, High Failure Rate
     • Tech Team – 2 developers, 1 VoIP engineer, 1 Product Manager
  3. The Beginning - Scenario
     • Manual configuration by hand-editing config files
     • Changes documented on wiki pages
     • A few shell scripts scattered around
     • Application updates directly via git push/pull
     • Deploys done at the completion of all features, often more than a month apart
  4. The Beginning - Problems
     • Documentation not always updated.
     • Rebuilding from scratch would have taken days.
     • Code deploys were error-prone.
     • No tracking of changes.
     • No dedicated Ops; ops tasks depended only on me.
  5. Step 1 - Scenario
     • Development and Testing Environments – 5 machines
     • From monolithic application to microservices.
     • Configuration and setup turned into a complex and unmaintainable maze.
     • Tried unsuccessfully to write shell scripts to configure the system.
  6. Step 1 - Scenario
     • Search for a solution
     • Learnt about ansible at rootconf
     • First set of external users
     • 2-week deploy cycle
  7. Step 1 - Achievements
     • Set up ansible scripts for Configuration Management.
     • Greatly reduced the failure rate of changes.
     • One central place for recording changes.
     • Added simple deployment code to ansible.
     • Quicker and simpler deployments.
     • Parts of ops work could be delegated and automated.
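
A large part of why configuration management lowers the failure rate of changes is idempotence: a task describes the desired state, changes the system only when that state is missing, and is safe to run again. The following is a minimal Python sketch of that idea, not ansible's implementation. The helper `ensure_line` and the sample setting are hypothetical names chosen for illustration.

```python
import tempfile
from pathlib import Path


def ensure_line(path: Path, line: str) -> bool:
    """Make sure `line` is present in the file; return True only if the file changed."""
    existing = path.read_text().splitlines() if path.exists() else []
    if line in existing:
        return False  # desired state already holds, so nothing is written
    existing.append(line)
    path.write_text("\n".join(existing) + "\n")
    return True


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        conf = Path(tmp) / "app.conf"  # hypothetical config file
        print(ensure_line(conf, "max_connections = 100"))  # first run changes the file
        print(ensure_line(conf, "max_connections = 100"))  # second run is a no-op
```

Because the function reports whether anything changed, repeated runs double as a record of which changes were actually applied, which hand-editing config files cannot give you.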
  8. Step 1 - Problems
     • Difficulty in tracking deployments
     • Lack of consistency in deployments, as there were no reusable builds
     • Rebuilding the system would still take up to a day
     • Low reliability
     • Frequent code integration issues
  9. Step 2 - Scenario
     • First set of production releases
     • Dev, Staging and Production infrastructure – 10 machines
     • Releases every week, sometimes every 2-3 days.
     • I shift to a full-time ops role
     • 2 developers
     • Need for higher reliability
  10. Step 2 - Achievements
      • Introduction of project management tools – jira, confluence, bamboo, slack/hipchat
      • Failed attempt at introducing CI/CD
      • More organized releases
      • Consistent configuration across environments
  11. Step 2 - Problems
      • Build consistency across various environments
      • Tracking of deploys across environments
      • Difficulty in managing infrastructure
      • Lack of experience to put together a release and deploy pipeline
  12. Step 3 - Scenario
      • New hire for project management and QA
      • 3 developers – better focus and faster releases.
      • High number of git merge conflicts
      • Dabbling with ansible for dynamic infrastructure setup
      • Playing around with AWS and Gcloud with a view to cutting costs
  13. Step 3 - Achievements
      • Improved understanding of git and git workflows
      • Replaced bamboo with jenkins
      • Set up CI/CD pipeline and integration tests
      • Applied the learnings to the new project started in parallel
      • Time to build the infrastructure cut down to hours
      • Multiple daily builds in dev environment
      • Better tracking of releases to various environments
  14. Step 3 - Problems
      • Need for dynamic environment to cut down costs
      • Need for further cutting down the time to build new infrastructure
      • Complexity of writing ansible code to manage infrastructure
      • Lack of full gcloud support in ansible (missing l7 load balancer module)
  15. Step 4 - Scenario
      • Multiple projects with multiple environments: 10-30 machines at a time. Not everything used all the time.
      • Chance to start from scratch on new projects
      • Bigger team of 5 devs, parallel work on multiple projects
  16. Step 4 - Achievements
      • Introduction of terraform for provisioning infrastructure greatly reduced the complexity.
      • Added packer builds for ready-to-use custom images.
      • Tied together packer + terraform + ansible, run via jenkins.
      • Very close to dynamic infrastructure (we still need to work on quick and safe backup and retrieval of data)
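
To make the "packer + terraform + ansible via jenkins" chain concrete, here is a rough Python sketch of the kind of wrapper a jenkins job could call. It is an illustration under assumptions, not the pipeline from the talk: the step order (bake image, provision, configure), the file names, and the helpers `build_pipeline` and `run_pipeline` are all hypothetical. Only standard CLI invocations are used (`packer build`, `terraform init`/`plan -out`/`apply` of a saved plan, `ansible-playbook -i`), and it defaults to a dry run so nothing is executed.

```python
import shlex
import subprocess
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Step:
    name: str
    command: List[str]
    cwd: Optional[str] = None  # working directory for the command, if any


def build_pipeline(packer_template: str, terraform_dir: str,
                   inventory: str, playbook: str) -> List[Step]:
    """Assumed order: bake an image, provision infrastructure, then configure hosts."""
    return [
        Step("bake image", ["packer", "build", packer_template]),
        Step("init terraform", ["terraform", "init", "-input=false"], cwd=terraform_dir),
        Step("plan infrastructure",
             ["terraform", "plan", "-input=false", "-out=tfplan"], cwd=terraform_dir),
        Step("apply infrastructure",
             ["terraform", "apply", "-input=false", "tfplan"], cwd=terraform_dir),
        Step("configure hosts", ["ansible-playbook", "-i", inventory, playbook]),
    ]


def run_pipeline(steps: List[Step], dry_run: bool = True) -> List[str]:
    """Run steps in order, stopping at the first failure; return the command lines."""
    executed = []
    for step in steps:
        line = shlex.join(step.command)
        executed.append(line)
        if dry_run:
            print(f"[dry-run] {step.name}: {line}")
        else:
            # check=True raises CalledProcessError, which fails the CI job
            subprocess.run(step.command, cwd=step.cwd, check=True)
    return executed


if __name__ == "__main__":
    # Hypothetical file names, for illustration only.
    run_pipeline(build_pipeline("image.json", "infra", "hosts.ini", "site.yml"))
```

Applying a saved plan file rather than running a bare `terraform apply` means the job applies exactly what was planned, which suits an unattended CI run.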
  17. Step 4 - Achievements
      • New infrastructure can be built in a few minutes.
      • Infrastructure has a high level of automation
      • Automated releases to dev along with integration testing, with the ability to merge branches and automatically release to testing and production once we are comfortable.
      • Ease of delegation and collaboration among the team
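
The release flow on this slide (automatic release to dev, then promotion to testing and production "once we are comfortable") can be read as a simple gate. The Python sketch below is one possible interpretation, not the team's actual jenkins logic: the environment names follow the slide, while `next_environment` and the `approved` flag standing in for human sign-off are assumptions.

```python
from typing import Optional

# Promotion order taken from the slide: dev first, then testing, then production.
ENVIRONMENTS = ["dev", "testing", "production"]


def next_environment(current: str, tests_passed: bool, approved: bool) -> Optional[str]:
    """Return the environment a build may be promoted to, or None to hold it."""
    if not tests_passed:
        return None  # failing integration tests block any promotion
    index = ENVIRONMENTS.index(current)
    if index == len(ENVIRONMENTS) - 1:
        return None  # already in the last environment
    if not approved:
        return None  # promotion beyond the current stage waits for sign-off
    return ENVIRONMENTS[index + 1]


if __name__ == "__main__":
    print(next_environment("dev", tests_passed=True, approved=True))
    print(next_environment("testing", tests_passed=True, approved=False))
```

Keeping the rule in one small function makes the promotion policy easy to review and to change, for example if dev-to-testing promotion later becomes fully automatic.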
  18. Step 4 - Problems
      • Data and code still remain on the same machines. They need to be separated out.
      • Infrastructure can’t be swapped out and replaced transparently.
      • Better handling of configuration required
      • Infrastructure code isn’t reusable enough. It needs to be broken down into reusable modules.