Lessons learned migrating 400 servers from Rackspace public cloud to a hosted OpenStack private cloud. Presented by Chris Snell at the Seattle DevOps Meetup Group.
Rackspace Public Cloud
…using about 800 GB of RAM in aggregate
…and a bunch of Cloud Databases (DBaaS) instances
…and a dozen Cloud Load Balancers (LBaaS)
…and growing 20% month-over-month
No more maintaining complex iptables rulesets
Dom0 threat exposure greatly reduced
IPsec VPN for errr’body!
Dedicated, HA firewalls and load balancers in front of all instances
Ubuntu 14.04 LTS. Added a run-once /etc/rc.local to do first-boot provisioning:
• Set up apt to use RAX mirrors
• Set up /etc/hosts
• Enable OpenStack ohai hints
• Alert us via HipChat if anything goes wrong
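A minimal sketch of the run-once idea, assuming a marker file is what makes later boots skip provisioning. The marker path, the `run_once` and `enable_ohai_hint` names, and the `alert` callback are hypothetical; the talk does not show the actual rc.local script, and the HipChat call is left as a plain callback rather than a real API request. The hint path assumes the conventional Ohai hints directory.

```python
import json
import tempfile
from pathlib import Path


def enable_ohai_hint(root: Path, name: str = "openstack") -> Path:
    """Write an empty JSON hint file under <root>/etc/chef/ohai/hints."""
    hints_dir = root / "etc" / "chef" / "ohai" / "hints"
    hints_dir.mkdir(parents=True, exist_ok=True)
    hint = hints_dir / f"{name}.json"
    hint.write_text(json.dumps({}))
    return hint


def run_once(root: Path, steps, alert) -> str:
    """Run provisioning steps a single time.

    Returns "skipped" if a previous run completed, "failed" if a step
    raised (after calling alert), or "ran" on success.
    """
    # Hypothetical marker location; any persistent path would do.
    marker = root / "var" / "lib" / "first-boot.done"
    if marker.exists():
        return "skipped"
    for step in steps:
        try:
            step(root)
        except Exception as exc:
            # Stand-in for "alert us via HipChat if anything goes wrong".
            alert(f"first-boot step {step.__name__} failed: {exc}")
            return "failed"
    # Only mark completion when every step succeeded.
    marker.parent.mkdir(parents=True, exist_ok=True)
    marker.write_text("done\n")
    return "ran"


if __name__ == "__main__":
    # Demo against a throwaway directory instead of the real filesystem root.
    demo_root = Path(tempfile.mkdtemp())
    print(run_once(demo_root, [enable_ohai_hint], print))
    print(run_once(demo_root, [enable_ohai_hint], print))
```

Passing the root directory in keeps the sketch safe to try outside a real instance; on a server the root would be `/` and the steps would also cover the apt mirror and /etc/hosts setup.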
some infrastructure:
DNS
- Authoritative DNS servers (DJB’s djbdns) x 2
- DNS caches (DJB’s dnscache) x 2
Logging
- rsyslog
- forwarding to Papertrail
Chef
- Open Source Chef server
SMTP
- Postfix forwarders (SendGrid, Mailgun, etc.)
a fresh Chef repo.
• Set up our own Chef Open Source server
• Built cookbooks as we moved each service
• Gangnam-style cookbook structure
• Base role to install common components
the hardest to move. Use replication to move the data. One cluster at a time.
[Diagram: Step 1 and Step 2, showing MySQL master and slave nodes in the public cloud and the private cloud]
at a time - recover to “green” cluster state after each move
Designate master-only (no data) nodes
Use cluster.routing.allocation.exclude._ip to move shards off public cloud nodes while decommissioning them and bringing up new private cloud nodes
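As a rough sketch of the exclusion step, the helper below builds the JSON body for a cluster settings update (sent as `PUT /_cluster/settings`) that excludes a list of node IPs, and a second helper checks a parsed `GET /_cluster/health` response before the next node is moved. The function names and IP addresses are hypothetical, and using a transient setting is an assumption, not something the talk specifies.

```python
import json


def exclude_ips_body(ips, transient=True):
    """Build a cluster settings body that excludes the given node IPs."""
    scope = "transient" if transient else "persistent"
    return {scope: {"cluster.routing.allocation.exclude._ip": ",".join(ips)}}


def is_settled(health):
    """True when a parsed cluster health response is green and idle."""
    return (
        health.get("status") == "green"
        and health.get("relocating_shards", 0) == 0
    )


if __name__ == "__main__":
    # Hypothetical public cloud node addresses to drain.
    body = exclude_ips_body(["10.1.1.1", "10.1.1.2"])
    print(json.dumps(body, indent=2))
    print(is_settled({"status": "yellow", "relocating_shards": 3}))
```

Once the excluded nodes hold no shards and the cluster reports green, they can be shut down and the next batch excluded.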
• Inaccurate metrics displayed in Horizon
• No historical trends/graphs
• No cluster-wide metrics
• Useless for capacity planning
• No alerting
• Ceilometer API is broken (in Havana, anyway…)
(production) instance
- Many out-of-the-box app integrations
- Custom checks written with Python (we have about 10 so far)
- Very powerful graph builder
- Cons: it’s expensive