
Elastic Cloud Inside Out

Elastic Co
March 08, 2017

Elastic Cloud runs thousands of clusters and is growing rapidly, while maintaining solid SLAs and allowing users to scale out, upgrade, and reliably monitor their clusters. Ever wonder how it works?

This session will go under the hood into how Elastic Cloud is built, how it handles complex tasks like efficient placement, zero-downtime upgrades, and host failures, as well as how the Elastic Cloud team uses it to orchestrate and manage thousands of clusters for Elastic Cloud customers.

Uri Cohen | Sr. Director, Product Management | Elastic
Ben Osborne | Site Reliability Engineer | Elastic
Christian Strzadala | Software Engineer | Elastic

Transcript

  1. Elastic Cloud Inside Out
     Elastic, March 8, 2017
     Ben Osborne @bj_osborne • Christian Strzadala @cstrzadala • Uri Cohen @uri1803
  2. Agenda
     1. Elastic Cloud, 2 years in
     2. What it looks like under the hood
     3. What it takes to keep the lights on for thousands of clusters
     4. Elastic Cloud as a Product!
  3. Brief History
     • Found Beta
     • Found GA
     • Elastic acquires Found
     • Found Gold & Platinum
     • Elasticsearch 2.0 available on Found
     • Say Heya to Elastic Cloud
     • Elasticsearch & Kibana 5.0 on Cloud
     • Elastic Cloud Enterprise Public Beta
     Years marked on the timeline: 2012, 2013, 2015, 2016, 2017
  4. Why Elastic Cloud?
     • Fully managed
     • Super easy scaling & upgrades
     • Runs on top-notch instances
     • Always the latest and greatest
     • X-Pack!
     • Support right from the source
     • All the APIs are there
  5. Key features
     • Service orientated architecture
     • Utilise AWS
     • Immutable services
     • Docker for containerisation
     • Services created specifically for Elastic stack component deployment
  6. Zookeeper: our beloved coordination system
     • Distributed, strongly consistent datastore
     • CaP
     • Fast reads for small pieces of data
     • Watchers
  7. What's the story (with a plan) morning glory?
     {
       "availability_zones": 1,
       "capacity": 1024,
       "failover": {},
       "plan_id": "887a397f-abc1-4896-8eca-e3c60d4ca10d",
       "strategy": {},
       "source": {
         "action": "elasticsearch.create-cluster",
         "user_id": 3403111101,
         "date": "2017-03-06T22:22:46.654Z",
         "facilitator": "console"
       },
       "elasticsearch": {
         "version": "5.2.2",
         "user_settings_overrides": null,
         "plugins": [],
         "user_plugins": [],
         "user_bundles": [],
         "user_settings": {}
       },
       "ssd": true,
       "instance_capacity": 1024,
       "instance_count": 1,
       "region": "us-east-1"
     }
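To make the plan document on this slide concrete, here is a minimal Python sketch of how a consumer might parse such a plan and pull out the fields it needs. The field names come from the slide; the trimmed plan copy, the `REQUIRED_KEYS` list, and the `summarise_plan` helper are hypothetical and only for illustration, not part of Elastic Cloud.

```python
import json

# Trimmed copy of the plan shown on the slide (valid JSON, some keys omitted).
PLAN_JSON = """
{
  "availability_zones": 1,
  "capacity": 1024,
  "plan_id": "887a397f-abc1-4896-8eca-e3c60d4ca10d",
  "source": {"action": "elasticsearch.create-cluster", "facilitator": "console"},
  "elasticsearch": {"version": "5.2.2", "plugins": []},
  "instance_capacity": 1024,
  "instance_count": 1,
  "region": "us-east-1"
}
"""

# Assumption: keys a consumer might insist on. This list is not from the talk.
REQUIRED_KEYS = (
    "plan_id",
    "region",
    "availability_zones",
    "instance_capacity",
    "instance_count",
    "elasticsearch",
)


def summarise_plan(raw: str) -> dict:
    """Parse a plan document and return a flat summary of selected fields."""
    plan = json.loads(raw)
    missing = [key for key in REQUIRED_KEYS if key not in plan]
    if missing:
        raise ValueError(f"plan is missing keys: {missing}")
    return {
        "plan_id": plan["plan_id"],
        "action": plan.get("source", {}).get("action"),
        "version": plan["elasticsearch"]["version"],
        "region": plan["region"],
        "zones": plan["availability_zones"],
        "instances": plan["instance_count"],
        "instance_capacity": plan["instance_capacity"],
    }


summary = summarise_plan(PLAN_JSON)
print(summary["action"], summary["version"], summary["region"])
```

Rejecting a plan with missing keys up front is a design choice of this sketch; the talk does not say how plans are validated.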
  8. The Constructor
     • Our cluster scheduler.
     • Calculates what needs to be changed when a cluster is added or reconfigured
     • Uses the concept of 'steps' when processing a pending plan to progress the plan forward (or backward)
  9. Constructor Steps
     • Steps are the stages that need to complete for a cluster to become available
     • Steps involve:
       • Querying Zookeeper data
       • Querying the cluster
       • Controlling allocation configuration, such as moving data away from a node it's about to kill.
       • Rollbacks (Cluster changes can fail)
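The idea of steps that move a plan forward, with rollbacks when a change fails, can be sketched roughly as follows. This is only an illustration under assumptions: the `Step` type, the `run_plan` function, and the example step functions are hypothetical names, and the talk does not describe how the Constructor actually implements steps.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Step:
    name: str
    apply: Callable[[dict], None]     # moves the plan forward
    rollback: Callable[[dict], None]  # undoes this step's change


def run_plan(steps: List[Step], state: dict) -> bool:
    """Apply steps in order; if one fails, roll back completed steps in reverse."""
    done: List[Step] = []
    for step in steps:
        try:
            step.apply(state)
            done.append(step)
        except Exception as exc:
            state["error"] = f"{step.name}: {exc}"
            for finished in reversed(done):
                finished.rollback(state)
            return False
    return True


# Hypothetical steps for growing a cluster by one node.
def add_node(state: dict) -> None:
    state["nodes"].append("new-node")


def remove_node(state: dict) -> None:
    state["nodes"].remove("new-node")


def migrate_data(state: dict) -> None:
    raise RuntimeError("shard relocation timed out")  # simulated failure


def nothing_to_undo(state: dict) -> None:
    pass


cluster_state = {"nodes": ["old-node"]}
succeeded = run_plan(
    [
        Step("add-node", add_node, remove_node),
        Step("migrate-data", migrate_data, nothing_to_undo),
    ],
    cluster_state,
)
print(succeeded, cluster_state)
```

In this run the second step fails, so the first step is undone and the node list ends up as it started, with the failure recorded under `error`.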
  10. The Allocator
      • Manages Elasticsearch and Kibana nodes
      • We use large AWS instances to house our services
      • We have a pool of allocators in each region
      • Each allocator advertises its resources in Zookeeper
      • Each node that runs on an allocator gets its own Docker container
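As a rough illustration of how advertised allocator resources could drive placement, here is a small Python sketch. The best-fit policy, the `Allocator` fields, and the `place_instance` function are assumptions made for this example; the talk does not state which placement policy Elastic Cloud uses.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Allocator:
    allocator_id: str
    zone: str
    free_mb: int  # memory the allocator currently advertises as free


def place_instance(
    allocators: List[Allocator], needed_mb: int, zone: str
) -> Optional[Allocator]:
    """Best-fit placement: pick the allocator in `zone` with the least
    free memory that still fits the instance, then reserve the memory."""
    candidates = [
        a for a in allocators if a.zone == zone and a.free_mb >= needed_mb
    ]
    if not candidates:
        return None
    chosen = min(candidates, key=lambda a: a.free_mb)
    chosen.free_mb -= needed_mb
    return chosen


pool = [
    Allocator("allocator-1", "us-east-1a", 4096),
    Allocator("allocator-2", "us-east-1a", 2048),
    Allocator("allocator-3", "us-east-1b", 1024),
]
chosen = place_instance(pool, needed_mb=1024, zone="us-east-1a")
print(chosen)
```

Best-fit keeps large allocators free for large instances; a real scheduler would also have to weigh things this sketch ignores, such as spreading a cluster's nodes across hosts.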
  11. Docker
      • All services, including Elasticsearch and Kibana, are deployed as Docker containers
      • The platform has moved from a shared cluster to clusters deployed in containers
      • Containers guarantee a share of resources
      • Malicious attacks are limited to a container
  12. The Proxy
      • We don't expose clusters directly, but route all requests through a proxy layer.
      • Maintains an in-memory routing table built from Zookeeper data
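A proxy that keeps an in-memory routing table fed by change notifications can be sketched as below. Everything here is an assumption for illustration: `FakeCoordinationStore` is an in-memory stand-in rather than a Zookeeper client, and the `Proxy` class, cluster IDs, and endpoint strings are hypothetical. Real Zookeeper watch semantics are more involved than this simple callback list.

```python
from typing import Callable, Dict, List, Optional

Callback = Callable[[str, Optional[List[str]]], None]


class FakeCoordinationStore:
    """In-memory stand-in for a coordination service: stores cluster
    endpoints and notifies registered watchers on every change."""

    def __init__(self) -> None:
        self._data: Dict[str, List[str]] = {}
        self._watchers: List[Callback] = []

    def watch(self, callback: Callback) -> None:
        self._watchers.append(callback)
        for cluster_id, endpoints in self._data.items():
            callback(cluster_id, list(endpoints))  # replay current state

    def set(self, cluster_id: str, endpoints: List[str]) -> None:
        self._data[cluster_id] = list(endpoints)
        for callback in self._watchers:
            callback(cluster_id, list(endpoints))

    def delete(self, cluster_id: str) -> None:
        self._data.pop(cluster_id, None)
        for callback in self._watchers:
            callback(cluster_id, None)  # None signals removal


class Proxy:
    """Routes by cluster ID using a local table, so no store lookup
    is needed on the request path."""

    def __init__(self, store: FakeCoordinationStore) -> None:
        self._routes: Dict[str, List[str]] = {}
        store.watch(self._on_change)

    def _on_change(self, cluster_id: str, endpoints: Optional[List[str]]) -> None:
        if endpoints is None:
            self._routes.pop(cluster_id, None)
        else:
            self._routes[cluster_id] = endpoints

    def route(self, cluster_id: str) -> Optional[str]:
        endpoints = self._routes.get(cluster_id)
        return endpoints[0] if endpoints else None


store = FakeCoordinationStore()
store.set("cluster-a", ["10.0.0.5:9200"])
proxy = Proxy(store)
store.set("cluster-b", ["10.0.1.7:9200", "10.0.1.8:9200"])
print(proxy.route("cluster-a"), proxy.route("cluster-b"))
```

Because the table is updated by notifications, a cluster that moves to another host only requires a write to the store; the proxy picks up the new endpoint without a restart.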
  13. Whatcha Monitoring There?
      • Zookeeper
      • Proxies
      • Load Balancers
      • Docker Containers
      • Host machines
      • Heartbeat SLA
  14. • Provisioning, orchestration, and management of multiple clusters
      • Deployed on-premise, in your private cloud … or wherever you want
      • Leverages the same technology used in Elastic Cloud
      • Automates frequent tasks such as snapshot/restore, upgrade and scale
      • Public beta available; GA expected Q2 2017
  15. Except where otherwise noted, this work is licensed under http://creativecommons.org/licenses/by-nd/4.0/
      Creative Commons and the double C in a circle are registered trademarks of Creative Commons in the United States and other countries. Third party marks and brands are the property of their respective holders.
      Please attribute Elastic with a link to elastic.co