What's Evolving in Elasticsearch

Elastic Co
February 17, 2016

Elastic's Clinton Gormley and Simon Willnauer present the latest happenings in Elasticsearch at Elastic{ON}16, February 17, 2016, in San Francisco.

Transcript

  1. 0.x, 1.x, 2.x, 5.x

    CRUD, Mapping, Search, Query DSL, Highlighting, Aliases, Cluster/Nodes/Indices Stats, Facets, Scripting, Index Templates, Dynamic Mapping, Parent/Child, Nested Objects, Realtime GET, Versioning, Routing, Percolator, Geo-points, Geo-shapes, Suggesters
  2. 0.x, 1.x, 2.x, 5.x

    Aggregations, CAT API, Distributed Percolator, Doc Values, Snapshot/Restore, Tribe Node, Circuit Breakers, Search Templates, Completion Suggester, Lucene Scripting, Top Hits
  3. 0.x, 1.x, 2.x, 5.x

    Aggregations, CAT API, Distributed Percolator, Doc Values, Snapshot/Restore, Tribe Node, Circuit Breakers, Search Templates, Completion Suggester, Lucene Scripting, Top Hits

    OOM, Corruption, Split Brain, Security Exploits
  4. 0.x, 1.x, 2.x, 5.x

    Aggregations, CAT API, Distributed Percolator, Doc Values, Snapshot/Restore, Tribe Node, Circuit Breakers, Search Templates, Completion Suggester, Lucene Scripting, Top Hits
  5. "A rolling restart of an #elasticsearch cluster reduced from 8 days to 30 minutes! @elastic win! #elasticon"

    Ash Kapow, @ashkapow
  6. Great Mappings Refactoring: Flexible

    Image by Abraham Ortelius, The Library of Congress, Public Domain, https://commons.wikimedia.org/w/index.php?curid=6872417
  7. Great Mappings Refactoring: Too Flexible…
  8. Great Mappings Refactoring: Too Flexible

    • Ambiguous
    • Silent failures
    • Index corruption
  9. Great Mappings Refactoring: Now

    • Consistent
    • Reliable
    • Solid foundation
  10. Translog reliability

    Image licensed under CC BY-SA 3.0, https://commons.wikimedia.org/w/index.php?curid=157542
  11. Translog reliability

    • Silent corruption
    • Data loss
    • Confusing semantics
  12. Translog reliability

    • Durable by default
    • Atomic writes
    • Fsync after each request
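The "fsync after each request" idea from the translog slides can be illustrated with a minimal sketch. This is not Elasticsearch's translog format or code: the JSON-lines file layout and the names `append_durably` and `replay` are assumptions made up for this example. It only shows the general pattern of forcing each appended operation to disk before acknowledging it.

```python
import json
import os
import tempfile


def append_durably(log_path, operation):
    """Append one operation as a JSON line and fsync before returning.

    Hypothetical helper: the file layout is invented for illustration.
    """
    record = json.dumps(operation, sort_keys=True) + "\n"
    with open(log_path, "a", encoding="utf-8") as log:
        log.write(record)
        log.flush()             # push Python's buffer to the operating system
        os.fsync(log.fileno())  # ask the OS to persist the bytes to disk


def replay(log_path):
    """Read every logged operation back, as a recovery step might."""
    with open(log_path, encoding="utf-8") as log:
        return [json.loads(line) for line in log]


if __name__ == "__main__":
    path = os.path.join(tempfile.mkdtemp(), "translog.jsonl")
    append_durably(path, {"op": "index", "id": "1"})
    print(replay(path))
```

The trade-off in this pattern is latency: every request pays for a disk sync, in exchange for not losing acknowledged operations if the process dies.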
  13. Heap usage: Better data structures

    Image by Laserlicht (Own work) [CC BY-SA 3.0 (http://creativecommons.org/licenses/by-sa/3.0)], via Wikimedia Commons
  14. Heap usage: Better data structures

    Fast doc values
    • Columnar store
    • File-system cache
    • On by default
  15. Heap usage: Better data structures

    Field length norms
    • Relevance statistic
    • File-system cache
    • Sparse data structure
  16. Heap usage: Better data structures

    Parent/Child v2
    • Doc values
    • Faster joins
  17. Heap usage: Better data structures

    Geo-points v2
    • Doc values
    • 50% index size
    • Much faster queries
  18. Security

    Image by User:Nino Barbieri, Own work, CC BY-SA 2.5, https://commons.wikimedia.org/w/index.php?curid=1766599
  19. Security

    • JarHell check
  20. Security

    • JarHell check
    • Java Security Manager
  21. Security

    • JarHell check
    • Java Security Manager
    • Minimal privileges
  22. Security

    • JarHell check
    • Java Security Manager
    • Minimal privileges
    • Modularisation of core
  23. Security

    • JarHell check
    • Java Security Manager
    • Minimal privileges
    • Modularisation of core
    • No Java serialisation
  24. Tests Tests Tests

    "We can't claim to support it… unless we actually test it." Robert Muir

    Image by CSIRO, CC BY 3.0, https://commons.wikimedia.org/w/index.php?curid=35458192
  25. Tests Tests Tests

    • Unit testable code
    • Core plugins with tests
    • Real integration tests
  26. Tests Tests Tests

    • Unit testable code
    • Core plugins with tests
    • Real integration tests
    • Leniency
  27. Release schedule

    Users want:
    • Quick bug fixes
    • Access to new features
    • Stability
    • Easy upgrades

    Developers want:
  28. Release schedule

    Users want:
    • Quick bug fixes
    • Access to new features
    • Stability
    • Easy upgrades

    Developers want:
    • To see their code being used
    • Reduced complexity
    • To move forward fast
  29. Release schedule

    Bugfix releases:
    • Current minor version
    • Last minor of previous major version
    • Bug fixes, small enhancements only

    Minor releases:

    Major releases:
  30. Release schedule

    Bugfix releases:
    • Current minor version
    • Last minor of previous major version
    • Bug fixes, small enhancements only

    Minor releases:
    • Frequent
    • Smaller
    • No breaking changes, unless required

    Major releases:
  31. Release schedule

    Bugfix releases:
    • Current minor version
    • Last minor of previous major version
    • Bug fixes, small enhancements only

    Minor releases:
    • Frequent
    • Smaller
    • No breaking changes, unless required

    Major releases:
    • More frequent
    • Smaller
    • Upgrade from any minor version of the previous major version
  32. Release schedule

    Bugfix releases:
    • Current minor version
    • Last minor of previous major version
    • Bug fixes, small enhancements only

    Minor releases:
    • Frequent
    • Smaller
    • No breaking changes, unless required

    Major releases:
    • More frequent
    • Smaller
    • Upgrade from any minor version of the previous major version

    What about old data?
  33. Reindex API

        POST _reindex
        {
          "src": { "index": "old_index" },
          "dest": { "index": "new_index" }
        }
  34. Reindex API

        POST _reindex
        {
          "src": {
            "index": "old_index",
            "query": { "match": { "user": "twitter" } }
          },
          "dest": { "index": "new_index" }
        }
  35. Update by Query

        POST users/_update_by_query
        {
          "query": { "match": { "user": "twitter" } },
          "script": "...."
        }
  36. Ingest Node: Data transformation

        Dec 23 14:30:01 louis CRON[619]: (www-data) CMD (php /usr/share/cacti/site/poller.php >/dev/null 2>/var/log/cacti/poller-error.log)
  37. Ingest Node: Data transformation

        {
          "@timestamp": "2013-12-23T22:30:01.000Z",
          "syslog_timestamp": "Dec 23 14:30:01",
          "syslog_hostname": "louis",
          "syslog_program": "CRON",
          "syslog_pid": "619",
          "received_at": "2013-12-23 22:49:22 UTC",
          "received_from": "0:0:0:0:0:0:0:1:52617",
          "syslog_message": "(www-data) CMD (php /usr/share/cacti/site/poller.php >/dev/null 2>/var/log/cacti/poller-error.log)",
  38. Ingest Node: Data enrichment

        {
          "ip": "45.24.12.8",
          "geoip": {
            "continent_name": "North America",
            "country_iso_code": "US",
            "region_name": "Michigan",
            "city_name": "Westland",
            "location": { "lon": -83.3686, "lat": 42.289 }
  39. Ingest Node: Wraps index/bulk API

        POST logs/apache?pipeline=apache_logs
        { … document body … }

        POST logs/apache/_bulk
        { "index": { "pipeline": "apache_logs" }}
        { … document body … }
        …
        …
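To make the bulk form concrete, the following sketch builds a newline-delimited bulk body in which every action line names the pipeline, mirroring the slide. The helper name `bulk_body` and the sample documents are hypothetical. The sketch only produces the request body and does not contact a cluster.

```python
import json


def bulk_body(docs, pipeline):
    """Serialise docs as newline-delimited JSON, one action line per document.

    Hypothetical helper: each document is preceded by an "index" action that
    carries the pipeline name, as on the slide.
    """
    lines = []
    for doc in docs:
        lines.append(json.dumps({"index": {"pipeline": pipeline}}))
        lines.append(json.dumps(doc))
    return "\n".join(lines) + "\n"  # a bulk body has to end with a newline


if __name__ == "__main__":
    sample = [{"message": "GET /index.html 200"}, {"message": "GET /missing 404"}]
    print("POST logs/apache/_bulk")
    print(bulk_body(sample, "apache_logs"), end="")
```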
  40. Painless Scripting: Dynamic/Static

        def first = input.doc.first_name.0;
        def last = input.doc.last_name.0;
        return first + " " + last;
  41. Painless Scripting: Dynamic/Static

        String first = (String)((List)((Map)input.get("doc")).get("first_name")).get(0);
        String last = (String)((List)((Map)input.get("doc")).get("last_name")).get(0);
        return first + " " + last;
  42. Painless Scripting: Dynamic/Static

        String first = (String)((List)((Map)input.get("doc")).get("first_name")).get(0);
        String last = (String)((List)((Map)input.get("doc")).get("last_name")).get(0);
        return first + " " + last;

    10x faster!
  43. Data structures

    Image by Laserlicht (Own work) [CC BY-SA 3.0 (http://creativecommons.org/licenses/by-sa/3.0)], via Wikimedia Commons
  44. Data structures: String Mappings

        { "type": "string", "index": "not_analyzed" }
  45. Data structures: String Mappings

        { "type": "text", "index": true }
        { "type": "keyword", "index": true }
  46. Data structures: String Mappings

        { "type": "text" }

    • Full text field
    • Full analysis chain
    • Full text relevance
  47. Data structures: String Mappings

        { "type": "keyword" }

    • Concrete string value
    • Limited keyword analysis
    • Doc values
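The split of `string` into `text` and `keyword` can be sketched as a small mapping translation. The function name `convert_string_mapping` is hypothetical, and the rules are a simplification under stated assumptions: an analyzed `string` (the old default) becomes `text`, a `not_analyzed` one becomes `keyword`, and any other `index` value is rejected rather than guessed. Other options are copied through unchanged, which may not hold for every real mapping parameter.

```python
def convert_string_mapping(mapping):
    """Translate a legacy `string` field mapping into `text` or `keyword`.

    Hypothetical helper covering only the two cases shown on the slides.
    """
    if mapping.get("type") != "string":
        return dict(mapping)  # not a string field, nothing to convert
    index = mapping.get("index", "analyzed")  # "analyzed" was the old default
    if index == "analyzed":
        new_type = "text"
    elif index == "not_analyzed":
        new_type = "keyword"
    else:
        raise ValueError("unhandled index setting: %r" % index)
    converted = {k: v for k, v in mapping.items() if k not in ("type", "index")}
    converted.update({"type": new_type, "index": True})
    return converted


if __name__ == "__main__":
    print(convert_string_mapping({"type": "string", "index": "not_analyzed"}))
    print(convert_string_mapping({"type": "string"}))
```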
  48. Data structures: Point field encoding
  49. Data structures: Point field encoding

    • k-dimensional tree
    • Replace encoding for Numeric/Date/IPv4
  50. NumericField vs PointField

    Bar chart, axis 0% to 100%. Metrics: Index Size, Index Time, Search Time, Heap Usage. NumericField: 100% on each metric. PointField values as extracted: 15%, 76%, 49%, 49%.
  51. Data structures: Point field encoding

    k-dimensional tree allows support for:
    • BigInteger
    • BigDecimal
    • IPv6
  52. Data structures: Point field encoding

    k-dimensional tree
  53. Data structures: Point field encoding

    k-dimensional tree where 1 <= k <= 8
  54. Data structures: Point field encoding

    k-dimensional tree
    • k=2: Lat/Lon
    • k=3: 3D Geo Points
    • k=?: ….
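For readers unfamiliar with k-dimensional trees, here is a toy in-memory version with a box (range) query. It is only an illustration of the general data structure, not Lucene's on-disk point encoding. The names `Node`, `build` and `range_search` are made up for this sketch. With k=1 it behaves like a numeric range filter, and with k=2 like a lat/lon bounding box.

```python
from collections import namedtuple

Node = namedtuple("Node", "point axis left right")


def build(points, depth=0):
    """Build a k-d tree, splitting on one dimension per level at the median."""
    if not points:
        return None
    k = len(points[0])
    axis = depth % k
    ordered = sorted(points, key=lambda p: p[axis])
    mid = len(ordered) // 2
    return Node(
        point=ordered[mid],
        axis=axis,
        left=build(ordered[:mid], depth + 1),
        right=build(ordered[mid + 1:], depth + 1),
    )


def range_search(node, lower, upper, found=None):
    """Collect points p with lower[i] <= p[i] <= upper[i] in every dimension."""
    if found is None:
        found = []
    if node is None:
        return found
    if all(lo <= x <= hi for lo, x, hi in zip(lower, node.point, upper)):
        found.append(node.point)
    split = node.point[node.axis]
    # Left subtree holds values <= split on this axis, right holds values >= split,
    # so a subtree is skipped when the query box cannot reach its side.
    if lower[node.axis] <= split:
        range_search(node.left, lower, upper, found)
    if upper[node.axis] >= split:
        range_search(node.right, lower, upper, found)
    return found


if __name__ == "__main__":
    # (lat, lon) pairs; the first one is the location from the geoip slide.
    cities = [(42.289, -83.3686), (37.77, -122.42), (51.5, -0.12), (48.85, 2.35)]
    tree = build(cities)
    print(sorted(range_search(tree, (40.0, -90.0), (50.0, 10.0))))
```

The pruning step is what makes the structure attractive for range queries: whole subtrees are skipped without inspecting their points.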
  55. Data structures: Geo Point Fields vs Geo Shape Fields
  56. Data structures: Geo Fields
  57. Data structures: Geo Fields

    Experimental!
    • Share encoding
    • Combined functionality
  58. Search: Completion suggester

    • Document oriented
    • Respects deletes
    • Boost by: prefix length, context, geolocation
  59. Search: Search After

        "size": 10,
        "sort": [
          { "age": "asc" },
          { "_uid": "desc" }
        ],
        "search_after": [ 18, "tweet#654323" ]
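The cursor semantics of `search_after` can be simulated over an in-memory list: sort by `age` ascending and `_uid` descending, then return only documents that sort strictly after the cursor values. The functions `compare` and `search` and the sample documents are hypothetical. This models the paging idea only, not how Elasticsearch executes it across shards.

```python
from functools import cmp_to_key


def compare(a, b):
    """Order by age ascending, then by _uid descending (as in the slide's sort)."""
    if a["age"] != b["age"]:
        return -1 if a["age"] < b["age"] else 1
    if a["_uid"] != b["_uid"]:
        return -1 if a["_uid"] > b["_uid"] else 1
    return 0


def search(docs, size, search_after=None):
    """Return the next `size` docs that sort strictly after the cursor."""
    ordered = sorted(docs, key=cmp_to_key(compare))
    if search_after is not None:
        cursor = {"age": search_after[0], "_uid": search_after[1]}
        ordered = [d for d in ordered if compare(d, cursor) > 0]
    return ordered[:size]


if __name__ == "__main__":
    docs = [
        {"age": 18, "_uid": "tweet#654323"},
        {"age": 18, "_uid": "tweet#700000"},
        {"age": 18, "_uid": "tweet#100000"},
        {"age": 19, "_uid": "tweet#900000"},
        {"age": 17, "_uid": "tweet#000001"},
    ]
    for hit in search(docs, 10, search_after=[18, "tweet#654323"]):
        print(hit)
```

To page forward, a caller would take the sort values of the last hit on one page and pass them as `search_after` for the next request. The unique `_uid` tiebreaker is what keeps documents with equal `age` from being skipped or repeated.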
  60. Java HTTP Client

    • Decouple server/client
    • core becomes server
    • Single system boundary
  61. Java HTTP Client

    • Aligns clients across languages
    • Eating our own dog food
    • Minimal dependencies
  62. Except where otherwise noted, this work is licensed under http://creativecommons.org/licenses/by-nd/4.0/

    Creative Commons and the double C in a circle are registered trademarks of Creative Commons in the United States and other countries. Third party marks and brands are the property of their respective holders.