DevOpsTO - AWS and Ansible with theScore

Slide 1

Slide 1 text

AWS and Ansible with theScore
#DevOpsTO
Luke Reeves, @lukereeves

Slide 2

Slide 2 text

What is theScore?
- Several applications: Sports, eSports and Fantasy
- Support across iOS, Android and web platforms
- Over ten million unique users and installations
- Billions of push notifications sent

Slide 3

Slide 3 text

All-in with AWS
- EC2 autoscaling via AMIs
- ELB, sometimes Redshift, sometimes CloudFront
- RDS
- ElastiCache
- S3
- CloudWatch
- Route 53
- IAM
- CloudTrail

Slide 4

Slide 4 text

Ansible
- We’ve tried other CM tools such as Chef and Salt
- Procedural instruction order is much easier to work with than functional
- The playbook format is very straightforward
- Having no master server by default gives us a lot of flexibility
- All infrastructure can be described in a single repo, and atomic changes are possible

Slide 5

Slide 5 text

Multiple Uses
- Provisioning hosts
- Installing application dependencies on hosts
- Deploying and updating applications
- Running commands across groups

Slide 6

Slide 6 text

Provisioning Role
- We have a “common” role that applies to all of our server images
- Other roles can always depend on the common role being run first on any hosts
- Repeatability is a must: the common role should be able to run at any time to add new base packages, user SSH keys, etc.

Slide 7

Slide 7 text

Application Roles
- Applications are broken into environments, groups and roles
- The roles are fine-grained and usually represent small to large clusters
- Example roles are “cms_loadbalancer” and “esports_api”
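One way to picture the environment / group / role breakdown is as an Ansible dynamic inventory. The sketch below is illustrative only: the host records, the `environment` and `role` keys, and the `production_esports_api` style of group name are assumptions, not theScore's actual layout. It builds the JSON shape that Ansible expects from an inventory script.

```python
import json
from collections import defaultdict


def build_inventory(hosts):
    """Group hosts by environment, by role, and by "<environment>_<role>".

    `hosts` is a list of dicts with hypothetical keys:
    "address", "environment" and "role".
    """
    groups = defaultdict(lambda: {"hosts": []})
    for host in hosts:
        env, role, address = host["environment"], host["role"], host["address"]
        for group in (env, role, f"{env}_{role}"):
            groups[group]["hosts"].append(address)
    inventory = dict(groups)
    # An explicit "_meta" section lets Ansible skip a per-host "--host" call.
    inventory["_meta"] = {"hostvars": {}}
    return inventory


if __name__ == "__main__":
    # Made-up hosts, using the two role names from the slide.
    example_hosts = [
        {"address": "10.0.0.1", "environment": "production", "role": "esports_api"},
        {"address": "10.0.0.2", "environment": "production", "role": "cms_loadbalancer"},
    ]
    print(json.dumps(build_inventory(example_hosts), indent=2, sort_keys=True))
```

With groups like these, a play can target a whole environment, one role everywhere, or one role in one environment.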

Slide 8

Slide 8 text

Autoscaling Overview
- Sports is primarily event-based, so autoscaling each application cluster is a good fit
- News can also have surges and is not as easy to predict
- With AWS you pay for exactly the resources you use, so tuning autoscaling is crucial

Slide 9

Slide 9 text

Provisioning, first try
- Create Ansible playbooks for various server roles and components (e.g. Rails server, memcached host)
- Use a server with scaling SNS hooks that applies the playbook to any new hosts that spin up
- After creation, the same server reconfigures any load balancers to add the new hosts
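The hook server in this first approach could be sketched roughly as follows. This is an assumption-laden illustration, not theScore's code: the function name, the `address_lookup` callable and the `site.yml` playbook name are all hypothetical. It relies only on the documented shape of Auto Scaling notifications delivered through SNS, where the event is a JSON string inside the `Message` field.

```python
import json


def playbook_command_for(notification_body, address_lookup, playbook="site.yml"):
    """Return an ansible-playbook argv for a launch notification, else None.

    `notification_body` is the raw SNS JSON; `address_lookup` is a
    hypothetical callable mapping an EC2 instance id to a reachable address.
    """
    envelope = json.loads(notification_body)
    # SNS wraps the Auto Scaling event as a JSON string in "Message".
    event = json.loads(envelope["Message"])
    if event.get("Event") != "autoscaling:EC2_INSTANCE_LAUNCH":
        return None
    address = address_lookup(event["EC2InstanceId"])
    # The trailing comma makes ansible-playbook treat the value as a
    # host list rather than a path to an inventory file.
    return ["ansible-playbook", "-i", f"{address},", playbook]
```

The returned list can be handed to `subprocess.run`. Every new host still has to sit through a full playbook run before it can serve traffic.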

Slide 10

Slide 10 text

Provisioning, second try
- Use the same Ansible playbooks for each role
- After any deployments (changes) to running instances, create an AMI snapshot of the host
- Update the scaling group for the changed role to point to the new AMI
- Use the EC2 APIs to configure load balancers on a schedule (every minute works fine)
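The snapshot-and-repoint steps above might look something like this with boto3. The talk does not say which SDK or helper names were used, so the function name, its parameters and the naming scheme for images are assumptions. The clients are passed in so the sketch stays self-contained: in practice they would be `boto3.client("ec2")` and `boto3.client("autoscaling")`.

```python
import time


def bake_and_roll(ec2, autoscaling, instance_id, group_name, instance_type):
    """Snapshot `instance_id` into an AMI and point `group_name` at it.

    `ec2` and `autoscaling` are assumed to be boto3 clients.
    """
    stamp = time.strftime("%Y%m%d%H%M%S")
    name = f"{group_name}-{stamp}"  # hypothetical naming scheme

    image_id = ec2.create_image(InstanceId=instance_id, Name=name)["ImageId"]
    # Block until the AMI is available; instances cannot launch from it before.
    ec2.get_waiter("image_available").wait(ImageIds=[image_id])

    # Launch configurations are immutable, so create a new one per AMI.
    autoscaling.create_launch_configuration(
        LaunchConfigurationName=name,
        ImageId=image_id,
        InstanceType=instance_type,
    )
    autoscaling.update_auto_scaling_group(
        AutoScalingGroupName=group_name,
        LaunchConfigurationName=name,
    )
    return image_id
```

A real launch configuration would also carry security groups, a key pair and similar settings, which are omitted here. Note that `create_image` reboots the instance unless `NoReboot=True` is passed.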

Slide 11

Slide 11 text

Provisioning
- Using the snapshot method allows us to bring up new EBS-backed hosts in an average of 30 seconds, as opposed to 10+ minutes
- Leveraging tags on the EC2 hosts and scaling groups lets us create load balancer configurations with minimal lag between host creation and registration for requests
- Make sure you freeze scaling groups during the deployment and snapshot, and unfreeze them afterwards
- The faster you can scale up and down, the more cost-effective AWS is. This also applies to tightening scaling thresholds (CPU usage, etc.)
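The freeze and unfreeze advice maps naturally onto a context manager, so the group is resumed even if the deployment fails partway. This is a minimal sketch assuming a boto3 Auto Scaling client is passed in. The name `frozen_scaling_group` is hypothetical, and the talk does not say which scaling processes theScore suspends, so this version suspends all of them.

```python
from contextlib import contextmanager


@contextmanager
def frozen_scaling_group(autoscaling, group_name):
    """Suspend scaling activity for `group_name` while the block runs.

    `autoscaling` is assumed to be a boto3 Auto Scaling client.
    """
    # Omitting ScalingProcesses suspends every process for the group.
    autoscaling.suspend_processes(AutoScalingGroupName=group_name)
    try:
        yield
    finally:
        # Always resume, even when the deployment or snapshot raises.
        autoscaling.resume_processes(AutoScalingGroupName=group_name)
```

The deployment and AMI snapshot steps would then run inside `with frozen_scaling_group(client, "esports_api"):`, where the group name is again only an example.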

Slide 12

Slide 12 text

Gaps
- Examples of wiring up load balancers… don’t work in practice
- Some fights with the template system (Jinja2): it’s very simple and you can’t just outright embed Python code
- When we started two years ago, we had to build out AWS description plugins
- We curse “bundle AMI” but don’t have a good replacement :-)
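On the Jinja2 point: templates cannot contain arbitrary Python, but Ansible does load custom filters from a `filter_plugins/` directory, which is one place such logic can live. The talk does not say whether theScore took this route, so treat the following as an illustration only. The file name and the `backend_lines` filter are hypothetical.

```python
# filter_plugins/lb_filters.py  (hypothetical file name)


def backend_lines(addresses, port=80):
    """Format addresses as sorted "server <address>:<port>" lines."""
    return [f"server {address}:{port}" for address in sorted(addresses)]


class FilterModule(object):
    """Ansible discovers custom filters through a class with this name."""

    def filters(self):
        # Maps the name used in templates to the Python callable.
        return {"backend_lines": backend_lines}
```

A template could then use something like `{{ groups['esports_api'] | backend_lines(8080) | join('\n') }}`, keeping the Python in the plugin and the template simple.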

Slide 13

Slide 13 text

Next Steps
- Using Consul and consul-template for config data propagation
- CircleCI running all deployments after the tests pass: why waste our time running the command?
- Putting all application-relevant configuration (the Ruby version, for example) in the app’s own GitHub repos

Slide 14

Slide 14 text

Conclusion
- Ansible is a great tool; there are definitely functions we would use other tools for, but it is still a great base
- Integrating Ansible and AWS has reduced our work (and drama) hugely
- Questions?
Luke Reeves, @lukereeves