Elastic Revolution

A bit about the Elastic story and why Elastic is transforming the software industry.

Pablo Musa

October 29, 2016

Transcript

  1. Elastic Revolution • Pablo Musa (@pablitomusa) • October 2016

  2. Pablo Musa • Engineer PUC-Rio • Master PUC-Rio • Backend Developer • Software Architect • Infra Lover • 2 years Hadoop DevOps • 3 years Elastic Guru
  3. Education Engineer • User enablement • Content creation • Travel the world (16 months) • 4 continents • 16 countries • 160+ classes • 2400+ enabled users • 4 new trainings
  4. Which?? What?? When?? How?? • Products: Elasticsearch, Kibana, Logstash, Cloud, Beats, Prelert • Versions (in the order listed on the slide): 2.3.1, 4.5.2, 2.2.0, 1.2.3, ???, 1.2.3
  5. 2010: First version of Elasticsearch released in February

  6. 2012: Elasticsearch becomes a company • Total cumulative downloads: 2M

  7. 2013: Kibana and Logstash open source projects join • Total cumulative downloads: 5M

  8. 2014: Elasticsearch 1.0 GA • Total cumulative downloads: 18M

  9. 2015: 1st Elastic{ON} user conference • we are now Elastic • Cloud acquired • Beats team joins • Total cumulative downloads: 45M

  10. 2016: 2nd Elastic{ON} user conference • ELK becomes "Elastic Stack" • Prelert acquired • Total cumulative downloads: 75M
  11. Kibana • Elasticsearch • Beats • Logstash • X-Pack (Security, Alerting, Monitoring, Reporting, Graph) • Elastic Cloud
  12. 5.0 is here. All new versions. All aligned.

  13. 13

  14. "It doesn't make sense to hire smart people and then tell them what to do; we hire smart people so they can tell us what to do." Steve Jobs

  15. TRUST: "I don't want to monitor people or know where anyone is on a Tuesday at 2 PM."

  16. We Are Everywhere

  17. WE DON'T SELL SHIT

  18. Usability

  19. "Everything should be made as simple as possible, but not simpler." Albert Einstein

  20. Community

  21. Pioneer Program: https://www.elastic.co/blog/elastic-pioneer-program

  22. Different Real World Problems

  23. We Love It All

  24. "Without data you are just another person with an opinion." William Edwards Deming
  25. 25

  26. "Gotta Catch 'Em All"

  27. "Gotta Catch 'Em All" • Cluster my_cluster • Server 1, Node A • Index twitter: d1 d2 d3 d4 d5 d6 d7 d8 d9 d10 d11 d12 • Index logs: d6 d3 d2 d5 d1 d4

  28. Split Data • Cluster my_cluster • Server 1, Node A • Index logs: d6 d3 d2 d5 d1 d4 • Index twitter: d1 d2 d3 d4 d5 d6 d7 d8 d9 d10 d11 d12 • Shard 0

  29. Cluster my_cluster • Server 1, Node A • Server 2, Node B • twitter shards 0 to 4 (documents d1 to d12) and logs shards 0 and 1 (d6 d3 d1 and d2 d5 d4) spread across the two nodes (diagram)

  30. Distribute the Load • Cluster my_cluster • Server 1, Node A • Server 2, Node B • Server 3, Node C • Server 4, Node D

  31. 2 Shards for Logs and Metrics • Cluster my_cluster • Servers 1-4 (Nodes A-D) • NOT OPTIMAL

  32. 4 Shards for Logs and Metrics • Cluster my_cluster • Servers 1-4 (Nodes A-D) • BETTER • But what about shard size?
  33. Math Time!
    • ~1000 events per second
    • 60 sec * 60 min * 24 hours * 1000 events => ~86.4M events per day
    • 1kb per event => ~82GB per day
    • 4 shards => ~20.5GB per shard
    • https://www.elastic.co/elasticon/conf/2016/sf/quantitative-cluster-sizing
    • For my use case, each shard should handle 45GB
    • 4 shards per day is NOT OPTIMAL
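The arithmetic on this slide can be reproduced in a few lines. The sketch below is an illustration, not code from the talk: the function names are hypothetical, and it assumes 1 GB = 1024 * 1024 KB, which is the convention under which one-kilobyte events at this rate come out at roughly 82 GB per day.

```python
import math

EVENTS_PER_SECOND = 1000   # from the slide
EVENT_SIZE_KB = 1          # from the slide
TARGET_SHARD_GB = 45       # the speaker's figure for his own use case


def daily_volume_gb(events_per_second, event_size_kb):
    """Return (events per day, GB per day), assuming 1 GB = 1024 * 1024 KB."""
    events_per_day = events_per_second * 60 * 60 * 24
    gb_per_day = events_per_day * event_size_kb / (1024 * 1024)
    return events_per_day, gb_per_day


def shards_needed(gb_per_day, target_shard_gb):
    """Smallest shard count that keeps every shard at or under the target size."""
    return max(1, math.ceil(gb_per_day / target_shard_gb))


events, gb = daily_volume_gb(EVENTS_PER_SECOND, EVENT_SIZE_KB)
print(f"{events:,} events/day, {gb:.1f} GB/day")
print(f"4 shards -> {gb / 4:.1f} GB per shard")
print(f"shards needed at {TARGET_SHARD_GB} GB each: {shards_needed(gb, TARGET_SHARD_GB)}")
```

With the slide's inputs this gives about 82 GB per day, and a 45 GB target is met with 2 shards rather than 4, which is why four shards per day wastes resources for this workload.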
  34. Deadlock! • Cluster my_cluster • Servers 1-4 (Nodes A-D) • Optimize Throughput • Optimize Data Storage • Which one is better?
  35. 35

  36. First, Maximize Throughput • Cluster my_cluster • Servers 1-4 (Nodes A-D) • Create a daily index with one shard per node.

  37-38. Then, Maximize Storage • Cluster my_cluster • Servers 1-4 (Nodes A-D) • Shrink the index to the optimal number of shards.
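A minimal sketch of the "shrink to the optimal number of shards" step, under stated assumptions. The Shrink API only accepts a target shard count that is a factor of the source count, so the helper below picks the smallest factor that still covers the number of shards the sizing calls for. The function names and the target index name are hypothetical; the code only builds the request and sends nothing. Per the Shrink docs, the source index also has to be made read-only and have a copy of every shard on one node before shrinking, which is not shown here.

```python
def shrink_targets(source_shards):
    """Shard counts a shrink could produce: the proper divisors of the source count."""
    return [n for n in range(1, source_shards) if source_shards % n == 0]


def pick_shrink_target(source_shards, shards_needed):
    """Smallest valid target providing at least `shards_needed` shards.

    Falls back to the source count when no smaller factor is large enough.
    """
    for n in shrink_targets(source_shards):
        if n >= shards_needed:
            return n
    return source_shards


def shrink_request(source_index, target_index, target_shards):
    """Build (method, path, body) for a shrink call; nothing is sent here."""
    path = f"/{source_index}/_shrink/{target_index}"
    body = {"settings": {"index.number_of_shards": target_shards}}
    return "POST", path, body


# Hypothetical example: a 4-shard daily index that only needs 2 shards for storage.
target = pick_shrink_target(4, 2)
print(shrink_request("logs-2016-10-19", "logs-2016-10-19-shrunk", target))
```

Writing with one shard per node and shrinking afterwards is what lets the same index serve both goals: wide for ingest, narrow for storage.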
  39. Goals and Mechanisms
    • Goals: achieve high ingest rates • don't waste resources
    • Mechanisms: Daily Indices • Templates • Alias • Rollover • Shrink
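Of the mechanisms listed, Rollover is the one that decides when a new index should start. The sketch below shows how such a request might be assembled, assuming the Rollover API's `max_age` and `max_docs` conditions and its optional explicit new-index name. The helper name and the threshold values are hypothetical, and nothing is sent to a cluster.

```python
def rollover_request(write_alias, new_index=None, max_age=None, max_docs=None):
    """Build (method, path, body) for a rollover call on a write alias."""
    path = f"/{write_alias}/_rollover"
    if new_index is not None:
        # An explicit name is needed when the current index name does not
        # end in a number that Elasticsearch can increment.
        path += f"/{new_index}"
    conditions = {}
    if max_age is not None:
        conditions["max_age"] = max_age
    if max_docs is not None:
        conditions["max_docs"] = max_docs
    return "POST", path, {"conditions": conditions}


# Hypothetical thresholds: roll over once a day or after 100M documents.
print(rollover_request("logs-write", new_index="logs-2016-10-20",
                       max_age="1d", max_docs=100_000_000))
```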
  40. Daily Indices • Cluster my_cluster • logs-2016-10-19: d6 d3 d2 d5 d1 d4

  41. Daily Indices • Cluster my_cluster • logs-2016-10-19: d6 d3 d2 d5 d1 d4 • logs-2016-10-20: d6 d3 d2 d5 d1 d4

  42. Daily Indices • Cluster my_cluster • logs-2016-10-19: d6 d3 d2 d5 d1 d4 • logs-2016-10-20: d6 d3 d2 d5 d1 d4 • logs-2016-10-21: d6 d3 d2 d5 d1 d4
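The daily naming scheme in these slides (`logs-2016-10-19`, `logs-2016-10-20`, ...) is easy to generate on the writing side. A small sketch, with hypothetical function names:

```python
from datetime import date, timedelta


def daily_index(prefix, day):
    """Index name for one day, e.g. logs-2016-10-19."""
    return f"{prefix}-{day:%Y-%m-%d}"


def daily_indices(prefix, first_day, count):
    """Names of `count` consecutive daily indices starting at `first_day`."""
    return [daily_index(prefix, first_day + timedelta(days=i)) for i in range(count)]


print(daily_indices("logs", date(2016, 10, 19), 3))
```

Because the date is zero-padded and ordered year-month-day, the names sort chronologically and a pattern such as `logs-*` matches all of them.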
  43. Templates: every new index whose name starts with 'logs-' will have 4 shards and '_all' disabled.

    PUT _template/logs
    {
      "template": "logs-*",
      "settings": { "number_of_shards": 4 },
      "mappings": { "_default_": { "_all": { "enabled": false } } }
    }
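One way to avoid hand-editing JSON is to build the template body as a data structure and serialize it. The sketch below mirrors the template on this slide; the function name is hypothetical, and the result is only printed, not sent to a cluster.

```python
import json


def logs_template(shards=4):
    """Template body for indices matching logs-*: fixed shard count, _all disabled."""
    return {
        "template": "logs-*",
        "settings": {"number_of_shards": shards},
        "mappings": {"_default_": {"_all": {"enabled": False}}},
    }


body = json.dumps(logs_template(), indent=2)
print("PUT _template/logs")
print(body)
```

The shard count is a parameter so that the same template can follow the cluster size, for example one shard per node.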
  44. Alias • Cluster my_cluster • users • Application • aliases: logs-write, logs-read • logs-2016-10-19: d6 d3 d2 d5 d1 d4

  45. Alias • Cluster my_cluster • users • Application • aliases: logs-write, logs-read • logs-2016-10-19: d6 d3 d2 d5 d1 d4 • logs-2016-10-20: d6 d3 d2 d5 d1 d4

  46. Alias • Cluster my_cluster • users • Application • aliases: logs-write, logs-read • logs-2016-10-19: d6 d3 d2 d5 d1 d4 • logs-2016-10-20: d6 d3 d2 d5 d1 d4 • logs-2016-10-21: d6 d3 d2 d5 d1 d4
  47. Templates: aliases can also be defined in templates.

    PUT _template/logs
    {
      "template": "logs-*",
      "settings": { "number_of_shards": 4 },
      "mappings": { ... },
      "aliases": { "logs-write": {}, "logs-read": {} }
    }

    * You still need to remove the "write" alias from the previous index.
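The footnote about removing the write alias from the previous index can be handled with a single `_aliases` request, whose actions are applied atomically. A minimal sketch, assuming the alias names from the slides; the function name is hypothetical and the body is only built, not sent.

```python
def write_alias_actions(previous_index, new_index=None, alias="logs-write"):
    """Body for POST _aliases that takes the write alias off the previous index.

    If `new_index` is given, the alias is also added there; that part is
    redundant when a template already attaches the alias at index creation.
    """
    actions = [{"remove": {"index": previous_index, "alias": alias}}]
    if new_index is not None:
        actions.append({"add": {"index": new_index, "alias": alias}})
    return {"actions": actions}


print(write_alias_actions("logs-2016-10-19", "logs-2016-10-20"))
```

The read alias is left alone, so searches through `logs-read` keep covering every daily index while writes only reach the newest one.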
  48-54. Let's Scale... • Cluster my_cluster • Servers 1-4 (Nodes A-D) • Application • logs-write • logs-read (diagram sequence; the indices behind logs-write and logs-read appear only in the diagrams)
  55. What is not here
    • JSON: the only way to use Elasticsearch • too verbose for presentations, and you can always go back to the docs
    • Replicas: high availability • diagrams would be even worse
    • Hot/Warm/Cold Architecture: lets you make the most of your hardware • diagrams would be even worse
  56. Where to go
    • Blog Post: https://www.elastic.co/blog/managing-time-based-indices-efficiently
    • Shrink Docs: https://www.elastic.co/guide/en/elasticsearch/reference/master/indices-shrink-index.html
    • Rollover Docs: https://www.elastic.co/guide/en/elasticsearch/reference/master/indices-rollover-index.html
  57. Elastic Revolution Principles
    • Smart people should be free to be smart
    • Trust
    • Distributed
    • We don't sell shit
    • Usability and Simplicity
    • Different Real World Problems
    • Community
  58. We ♥ our community
    ‒ https://www.elastic.co (Website)
    ‒ https://www.elastic.co/learn (Learning Resources)
    ‒ https://www.elastic.co/community/meetups (Meetups)
    ‒ https://discuss.elastic.co (Discussion Forum)
    ‒ elasticsearch-pt@googlegroups.com (Portuguese-language mailing list)
  59. Annual Elasticsearch User Conference • March 7-9, 2017 • Pier 48, San Francisco, CA • 2,500 attendees
    • SUBMIT A TALK: Call for Speakers Open
    • SUBMIT A CAUSE: First Cause Awards
    • https://www.elastic.co/elasticon/conf/2017/sf/registration
    Thank You! Questions?