Slide 1

Slide 1 text

How edX uses Ansible
http://github.com/edx/configuration
Open-source configuration management for an open-source learning platform

Slide 2

Slide 2 text

What is edX?
● A website for learning
● A website for creating courses
● A platform for research and analytics
● A community of learners and developers creating courses and extending the edX open source platform

Slide 3

Slide 3 text

Taking a course as a student
● Interactives
● Inline discussions
● Peer assessment
● Code grading

Slide 4

Slide 4 text

Creating a course as an instructor
● GUI for creating different problems and publishing courses
● XML export and import
● Instructor view for grade reports and viewing student progress

Slide 5

Slide 5 text

Interacting with students and instructors
● General purpose forums for every course
● Inline discussions on each problem

Slide 6

Slide 6 text

Try out the platform!
● Point your browser to https://www.sandbox.edx.org
● Log in as [email protected] / edx
● This public sandbox runs on a single EC2 server that is recreated weekly
● One of the many examples of Ansible automation @ edX

Slide 7

Slide 7 text

edX and Ansible
● ~ 50 custom Ansible roles
● ~ 700 Ansible tasks
● ~ 150 forks of our configuration repo
● edX Ansible plays are run every day at edX and around the globe to provision and deploy the platform

Slide 8

Slide 8 text

Why Ansible?
● edX started out with Puppet and Fabric
  ○ Even the simplest tasks felt like a constant fight with the tool
  ○ We wanted one tool for code updates and base OS configuration
While we were having issues with Puppet, Stanford was adopting the edX platform and having similar issues with Chef. edX started using Ansible (1.3) in September 2013.

Slide 9

Slide 9 text

Roles, plays and vars - edX conventions
● Ansible plays are short, simple and often limited to calling a single role
● Ansible roles are self-contained
  ○ The role name is used as a prefix to namespace variables
  ○ Defaults are defined in the role and overridden using extra vars

Slide 10

Slide 10 text

Updating code and configuration
● Tasks that update application code are put in a single deploy.yml task file
● The deploy task file is included with tags: `- include: deploy.yml tags=deploy`

Slide 11

Slide 11 text

Ansible variables
● Role variables are defined in defaults/main.yml
● Variables meant to be overridden are written in ALL CAPS
● Internal role variables are lowercase
● Defaults assume a single-server localhost installation
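
The naming conventions above lend themselves to a mechanical check. The following is a minimal sketch in Python, assuming a role's defaults have already been loaded into a dict. The function name, the role name and the sample variables are hypothetical and are not taken from the edX configuration repo.

```python
def classify_role_vars(role, defaults):
    """Split a role's default variables by the naming conventions.

    Returns three lists:
      overridable - ALL CAPS names, meant to be overridden per environment
      internal    - any other prefixed names, private to the role
      unprefixed  - names that do not start with the role name
    """
    overridable, internal, unprefixed = [], [], []
    prefix = role.lower() + "_"
    for name in sorted(defaults):
        if not name.lower().startswith(prefix):
            unprefixed.append(name)
        elif name.isupper():
            overridable.append(name)
        else:
            internal.append(name)
    return overridable, internal, unprefixed


# Hypothetical defaults for a role named "forum" (illustration only).
forum_defaults = {
    "FORUM_MONGO_HOSTS": ["localhost"],
    "FORUM_PORT": 4567,
    "forum_app_dir": "/edx/app/forum",
    "listen_address": "127.0.0.1",
}

overridable, internal, unprefixed = classify_role_vars("forum", forum_defaults)
print("overridable:", overridable)
print("internal:", internal)
print("unprefixed:", unprefixed)
```

A check like this could run in CI to flag variables that escape the role's namespace, although the slides do not say whether edX enforces the convention automatically.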

Slide 12

Slide 12 text

edX operational requirements
● Universities that run edX want control over the tools and the data
● Deployments must be isolated; student privacy is extremely important
● edX operates in the cloud but strives to be cloud agnostic
● Official releases of our configuration scripts come with Vagrant images and public AMIs

Slide 13

Slide 13 text

edX uses three different types of installations
● Devstack - Using Vagrant, code directories are exported from the host filesystem to the guest
● Fullstack - Every service in production installed on a single server
● A small number of multi-cluster redundant configurations (stage, production, loadtest)

Slide 14

Slide 14 text

Devstack - optimized for development
● curl the Vagrantfile
● `vagrant up`
● Run the dev server
● See changes instantly on the guest VM
● Develop locally on the host using an IDE

Slide 15

Slide 15 text

Fullstack - All edX services
● Single server configured the same as production
● One-click single instance provisioning for launching sandboxes
● Ansible handles creation, termination and config
● Official Vagrant images and public AMIs

Slide 16

Slide 16 text

Single self-contained edX installation on a single Ubuntu server
Nginx with basic auth
OS services
● MySQL server
● Mongo server
● RabbitMQ
● ElasticSearch
● Memcache
edX services (supervisor)
● student courseware (gunicorn/django)
● course authoring (gunicorn/django)
● workers (django)
● forum (sinatra)
● xserver (gunicorn/wsgi)
● certificates (python)
sandbox.yml
- edxapp
- forums
- xserver
- certs

Slide 17

Slide 17 text

Installing edX in production
● git checkouts instead of packages
● Python virtual environments for every service
● Pre-built images for AWS auto-scaling
● AWS virtual private clouds used for multiple edX installations
● VPC layout described using CloudFormation
● Utilizing AWS resources where possible (RDS, ELB, S3, ElastiCache, SES)

Slide 18

Slide 18 text

● Every cluster has a corresponding Ansible play for configuration
● A single ops VPC (lower left) connects to the other VPCs for running config updates and creating deploy images

Slide 19

Slide 19 text

Configuration repos for multiple installations
● A single generic open source repo contains all of the edX roles
● A private repo contains roles internal to edX
● A private repo contains one YAML file per environment for variable overrides
● For code version updates, vars are set on the command line
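
The layering described above (role defaults, a per-environment YAML file, then command-line vars) can be sketched as a small Python helper that assembles an `ansible-playbook` invocation. This is only an illustration: the play name, inventory path, file path and the `edxapp_version` variable are hypothetical, and the slides do not show the exact command edX runs.

```python
def build_playbook_command(play, inventory, env_file=None, extra_vars=None,
                           tags=None, limit=None):
    """Return an ansible-playbook argument list for one run."""
    cmd = ["ansible-playbook", "-i", inventory, play]
    if env_file:
        # "-e @file" loads extra vars from a YAML or JSON file.
        cmd += ["-e", "@" + env_file]
    for key in sorted(extra_vars or {}):
        # key=value pairs set on the command line, e.g. a code version.
        cmd += ["-e", "{}={}".format(key, extra_vars[key])]
    if tags:
        cmd += ["--tags", ",".join(tags)]
    if limit:
        cmd += ["--limit", limit]
    return cmd


# Hypothetical names throughout; none of these come from the edX repos.
cmd = build_playbook_command(
    play="edxapp.yml",
    inventory="ec2.py",
    env_file="vars/stage.yml",
    extra_vars={"edxapp_version": "release-2014-01-01"},
    tags=["deploy"],
)
print(" ".join(cmd))
```

Returning a list rather than a string means the command can be handed to `subprocess` without shell quoting.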

Slide 20

Slide 20 text

How edX uses Ansible for server administration
● A Jenkins server connects to the other VPCs and runs Ansible plays
● Inventory is fed directly into Jenkins via an inventory script that gathers tag information
● All Ansible tasks are run from Jenkins for auditing and repeatability

Slide 21

Slide 21 text

Ansible Inventory in a Jenkins dropdown
Feed the output of ec2.py into Jenkins with the Dynamic Choice Parameter plugin and a simple Groovy script.
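
The slide uses a Groovy script inside Jenkins for this step. The sketch below shows the same idea in Python instead: it turns dynamic inventory JSON, of the kind `ec2.py --list` prints, into a sorted list of dropdown choices. The group names in the sample are made up.

```python
import json


def inventory_choices(inventory_json):
    """Return sorted group names from dynamic inventory JSON output."""
    inventory = json.loads(inventory_json)
    # "_meta" holds per-host variables, so it is not a selectable group.
    return sorted(name for name in inventory if name != "_meta")


# Made-up sample shaped like dynamic inventory output.
sample = json.dumps({
    "tag_role_edxapp": ["10.0.0.5"],
    "tag_environment_stage": ["10.0.0.5", "10.0.0.6"],
    "_meta": {"hostvars": {}},
})

choices = inventory_choices(sample)
print(choices)
```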

Slide 22

Slide 22 text

Selecting inventory in Jenkins
● A special first_in prefix on ec2.py group names lets tasks run on one server in a tag group
● Use serial with pre and post tasks for rolling updates
[Figure: group names returned by ec2.py]
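
One way to picture the first_in groups is as a post-processing step on the inventory: for every group, add a companion group holding a single host. The sketch below is an assumption about how this could work, not the edX implementation. The exact `first_in_` naming, the choice of the first host in sorted order, and the sample groups are all hypothetical.

```python
def add_first_in_groups(inventory, prefix="first_in_"):
    """Return a copy of the inventory with one single-host group per group."""
    result = dict(inventory)
    for group, hosts in inventory.items():
        # Skip host variables and anything that is not a non-empty host list.
        if group == "_meta" or not isinstance(hosts, list) or not hosts:
            continue
        # Sort so that the chosen host is stable between runs.
        result[prefix + group] = [sorted(hosts)[0]]
    return result


# Hypothetical inventory with two tag groups.
inventory = {
    "tag_role_edxapp": ["10.0.0.6", "10.0.0.5"],
    "tag_role_forum": ["10.0.0.9"],
    "_meta": {"hostvars": {}},
}

expanded = add_first_in_groups(inventory)
print(expanded["first_in_tag_role_edxapp"])
print(expanded["first_in_tag_role_forum"])
```

Targeting the companion group from Jenkins then runs a play on exactly one server of the tag group, while the original group remains available for cluster-wide runs.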

Slide 23

Slide 23 text

Running Ansible from Jenkins
Plays are selected from a drop-down and run on any cluster or individual server returned by the inventory script.

Slide 24

Slide 24 text

Image based deployments
● Using images and autoscaling makes it easier to scale
● Ansible was the perfect tool to make this work with very little additional effort
● Netflix’s Asgard has proven extremely useful for managing autoscaling groups, cutovers and rollbacks

Slide 25

Slide 25 text

Creating an AMI with Jenkins and Ansible
● A Python script launches an instance and runs Ansible at bring-up via user-data
● Progress is sent back to the user via SQS using a custom Ansible callback plugin

Slide 26

Slide 26 text

The callback plugin allows for our own custom Ansible status output.
After every Ansible run, the longest-running tasks are displayed with a time summary.
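
The timing summary can be sketched independently of Ansible. The class below is a minimal, hypothetical illustration of the bookkeeping such a plugin needs. In a real callback plugin its two public methods would be called from Ansible's task-start and end-of-run hooks, whose names differ between Ansible versions, so they are deliberately left out here.

```python
import time


class TaskTimer:
    """Record how long each task takes and report the slowest ones."""

    def __init__(self, clock=time.time):
        self._clock = clock
        self._current = None   # (task name, start time) of the running task
        self.durations = {}    # task name -> accumulated seconds

    def task_started(self, name):
        """Call when a task starts; this also closes out the previous task."""
        now = self._clock()
        self._finish_current(now)
        self._current = (name, now)

    def run_finished(self, top=10):
        """Call at the end of the run; returns (name, seconds), slowest first."""
        self._finish_current(self._clock())
        ranked = sorted(self.durations.items(),
                        key=lambda item: item[1], reverse=True)
        return ranked[:top]

    def _finish_current(self, now):
        if self._current is not None:
            name, started = self._current
            self.durations[name] = self.durations.get(name, 0.0) + (now - started)
            self._current = None


# A fake clock keeps the example deterministic; task names are made up.
ticks = iter([0.0, 2.0, 12.0, 13.0])
timer = TaskTimer(clock=lambda: next(ticks))
timer.task_started("install packages")
timer.task_started("git checkout")
timer.task_started("restart services")
summary = timer.run_finished()
for name, seconds in summary:
    print("{:<20} {:6.1f}s".format(name, seconds))
```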

Slide 27

Slide 27 text

Releasing with Asgard
Servers are built from the AMIs using Netflix’s Asgard. Traffic is sent to the new servers and monitored, then the old cluster is removed.

Slide 28

Slide 28 text

What’s next for edX and Ansible?
● Initiating deployments from HipChat
● Moving CloudFormation provisioning to Ansible using VPC modules
● Productizing provisioning in other cloud environments
● Automating canary workflows

Slide 29

Slide 29 text

THANKS!!
● Questions?
● For more information
  ○ jarv on freenode (#edx-code and #ansible)
  ○ The edx-ops email list
  ○ edX configuration repo
  ○ Stand up your own edX website!