What's Evolving in Elasticsearch

Elastic Co
February 17, 2016

Elastic's Clinton Gormley and Simon Willnauer present the latest happenings in Elasticsearch at Elastic{ON}16, February 17, 2016, in San Francisco.

Transcript

  1. Timeline: 0.x, 1.x, 2.x, 5.x
    CRUD, Mapping, Search, Query DSL, Highlighting, Aliases, Cluster / Nodes / Indices Stats, Facets, Scripting, Index Templates, Dynamic Mapping, Parent/Child, Nested Objects, Realtime GET, Versioning, Routing, Percolator, Geo-points, Geo-shapes, Suggesters
  2. Timeline: 0.x, 1.x, 2.x, 5.x
    Aggregations, CAT API, Distributed Percolator, Doc Values, Snapshot/Restore, Tribe Node, Circuit Breakers, Search Templates, Completion Suggester, Lucene Scripting, Top Hits
  3. Timeline: 0.x, 1.x, 2.x, 5.x
    Aggregations, CAT API, Distributed Percolator, Doc Values, Snapshot/Restore, Tribe Node, Circuit Breakers, Search Templates, Completion Suggester, Lucene Scripting, Top Hits
    OOM, Corruption, Split Brain, Security Exploits
  4. Timeline: 0.x, 1.x, 2.x, 5.x
    Aggregations, CAT API, Distributed Percolator, Doc Values, Snapshot/Restore, Tribe Node, Circuit Breakers, Search Templates, Completion Suggester, Lucene Scripting, Top Hits
  5. "A rolling restart of an #elasticsearch cluster reduced from 8 days to 30 minutes! @elastic win! #elasticon" Ash Kapow - @ashkapow
  6. Great Mappings Refactoring: Flexible
    Image by Abraham Ortelius - The Library of Congress, Public Domain, https://commons.wikimedia.org/w/index.php?curid=6872417
  7. Great Mappings Refactoring: Too Flexible…
  8. Great Mappings Refactoring: Too Flexible
    Ambiguous, Silent failures, Index corruption
  9. Great Mappings Refactoring: Now
    Consistent, Reliable, Solid foundation
  10. Translog reliability
    Image licensed under CC BY-SA 3.0, https://commons.wikimedia.org/w/index.php?curid=157542
  11. Translog reliability
    • Silent corruption • Data loss • Confusing semantics
  12. Translog reliability
    • Durable by default • Atomic writes • Fsync after each request
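The "durable by default" bullets can be illustrated with a toy append-only log. This is a minimal Python sketch and not Elasticsearch's translog: the class name `TinyTranslog`, the JSON-lines record format, and the file name are assumptions made for illustration. It shows the one idea the slide names, forcing each request to disk with an fsync before returning, and it skips a torn final record on replay.

```python
import json
import os
import tempfile


class TinyTranslog:
    """Toy append-only operation log (hypothetical, not Elasticsearch code)."""

    def __init__(self, path):
        self.path = path
        # Append mode: existing records are never rewritten in place.
        self._file = open(path, "ab")

    def append(self, operation):
        """Write one operation as a JSON line, then force it to disk."""
        record = json.dumps(operation, sort_keys=True).encode("utf-8") + b"\n"
        self._file.write(record)
        self._file.flush()             # push Python's buffer to the OS
        os.fsync(self._file.fileno())  # ask the OS to persist it to the device

    def close(self):
        self._file.close()

    @staticmethod
    def replay(path):
        """Read back every complete record; a final line without a newline is skipped."""
        with open(path, "rb") as handle:
            return [json.loads(line) for line in handle if line.endswith(b"\n")]


def demo():
    """Append two operations to a fresh log and read them back."""
    path = os.path.join(tempfile.mkdtemp(), "translog.log")
    log = TinyTranslog(path)
    log.append({"op": "index", "id": "1", "doc": {"user": "twitter"}})
    log.append({"op": "delete", "id": "1"})
    log.close()
    return TinyTranslog.replay(path)
```

The trade-off in this sketch is the usual one: an fsync per request costs latency, in exchange for not losing an acknowledged operation if the process dies.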
  13. Heap usage: Better data structures
    Image by Laserlicht (Own work) [CC BY-SA 3.0 (http://creativecommons.org/licenses/by-sa/3.0)], via Wikimedia Commons
  14. Heap usage: Better data structures
    Doc-values: Fast, Columnar store, File-system cache, On by default
  15. Heap usage: Better data structures
    Field length norms: Relevance statistic, File-system cache, Sparse data structure
  16. Heap usage: Better data structures
    Parent/Child v2: Doc values, Faster joins
  17. Heap usage: Better data structures
    Geo-points v2: Doc values, 50% index size, Much faster queries
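The "columnar store" idea behind doc values can be sketched in a few lines of Python. This is only an in-memory illustration, not how Lucene lays out doc values on disk; the helper name `build_columns` and the sample documents are made up for the example. It shows why a per-field layout suits sorting and aggregations: reading one field does not touch the others.

```python
def build_columns(documents, fields):
    """Pivot row-oriented documents into one value list per field."""
    return {field: [doc.get(field) for doc in documents] for field in fields}


# Illustrative documents; the field names are made up for this example.
documents = [
    {"user": "alice", "age": 18},
    {"user": "bob", "age": 21},
    {"user": "carol", "age": 30},
]

columns = build_columns(documents, ["user", "age"])

# An aggregation over "age" scans one list and never reads the "user" values.
average_age = sum(columns["age"]) / len(columns["age"])
print(average_age)
```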
  18. Security
    Image by User:Nino Barbieri - Own work, CC BY-SA 2.5, https://commons.wikimedia.org/w/index.php?curid=1766599
  19. Security
    JarHell check
  20. Security
    JarHell check, Java Security Manager
  21. Security
    JarHell check, Java Security Manager, Minimal privileges
  22. Security
    JarHell check, Java Security Manager, Minimal privileges, Modularisation of core
  23. Security
    JarHell check, Java Security Manager, Minimal privileges, Modularisation of core, No Java serialisation
  24. Tests Tests Tests
    "We can't claim to support it… …unless we actually test it." Robert Muir
    Image by CSIRO, CC BY 3.0, https://commons.wikimedia.org/w/index.php?curid=35458192
  25. Tests Tests Tests
    Unit testable code, Core plugins with tests, Real integration tests
  26. Tests Tests Tests
    Unit testable code, Core plugins with tests, Real integration tests, Leniency
  27. Release schedule
    Users want: • Quick bug fixes • Access to new features • Stability • Easy upgrades
    Developers want
  28. Release schedule
    Users want: • Quick bug fixes • Access to new features • Stability • Easy upgrades
    Developers want: • To see their code being used • Reduced complexity • To move forward fast
  29. Release schedule
    Bugfix releases: • Current minor version • Last minor of previous major version • Bug fixes, small enhancements only
    Minor releases
    Major releases
  30. Release schedule
    Bugfix releases: • Current minor version • Last minor of previous major version • Bug fixes, small enhancements only
    Minor releases: • Frequent • Smaller • No breaking changes, unless required
    Major releases
  31. Release schedule
    Bugfix releases: • Current minor version • Last minor of previous major version • Bug fixes, small enhancements only
    Minor releases: • Frequent • Smaller • No breaking changes, unless required
    Major releases: • More frequent • Smaller • Upgrade from any minor version of the previous major version
  32. Release schedule
    Bugfix releases: • Current minor version • Last minor of previous major version • Bug fixes, small enhancements only
    Minor releases: • Frequent • Smaller • No breaking changes, unless required
    Major releases: • More frequent • Smaller • Upgrade from any minor version of the previous major version
    What about old data?
  33. Reindex API
    POST _reindex { "src": { "index": "old_index" }, "dest": { "index": "new_index" } }
  34. Reindex API
    POST _reindex { "src": { "index": "old_index", "query": { "match": { "user": "twitter" } } }, "dest": { "index": "new_index" } }
  35. Update by Query
    POST users/_update_by_query { "query": { "match": { "user": "twitter" } }, "script": "...." }
  36. Ingest Node: Data transformation
    Dec 23 14:30:01 louis CRON[619]: (www-data) CMD (php /usr/share/cacti/site/poller.php >/dev/null 2>/var/log/cacti/poller-error.log)
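The raw syslog line on this slide can be split into named fields with a regular expression. The Python sketch below is only an illustration of that kind of transformation and is not the Ingest Node implementation: the pattern `SYSLOG_LINE` and the helper `parse_syslog` are hypothetical, and the field names are borrowed from the structured document in the transcript. It does not produce the `@timestamp`, `received_at`, or `received_from` fields.

```python
import re

# Pattern for the BSD-style syslog line shown on the slide.
# Group names mirror the field names used in the structured document.
SYSLOG_LINE = re.compile(
    r"^(?P<syslog_timestamp>\w{3}\s+\d{1,2} \d{2}:\d{2}:\d{2}) "
    r"(?P<syslog_hostname>\S+) "
    r"(?P<syslog_program>[^\[\s:]+)(?:\[(?P<syslog_pid>\d+)\])?: "
    r"(?P<syslog_message>.*)$"
)


def parse_syslog(line):
    """Split one syslog line into named fields, or return None if it does not match."""
    match = SYSLOG_LINE.match(line)
    return match.groupdict() if match else None


raw = (
    "Dec 23 14:30:01 louis CRON[619]: (www-data) CMD "
    "(php /usr/share/cacti/site/poller.php >/dev/null "
    "2>/var/log/cacti/poller-error.log)"
)
fields = parse_syslog(raw)
```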
  37. Ingest Node: Data transformation
    { "@timestamp": "2013-12-23T22:30:01.000Z", "syslog_timestamp": "Dec 23 14:30:01", "syslog_hostname": "louis", "syslog_program": "CRON", "syslog_pid": "619", "received_at": "2013-12-23 22:49:22 UTC", "received_from": "0:0:0:0:0:0:0:1:52617", "syslog_message": "(www-data) CMD (php /usr/share/cacti/site/poller.php >/dev/null 2>/var/log/cacti/poller-error.log)",
  38. Ingest Node: Data enrichment
    { "ip": "45.24.12.8", "geoip": { "continent_name": "North America", "country_iso_code": "US", "region_name": "Michigan", "city_name": "Westland", "location": { "lon": -83.3686, "lat": 42.289 }
  39. Ingest Node: Wraps index/bulk API
    POST logs/apache?pipeline=apache_logs { … document body … }
    POST logs/apache/_bulk { "index": { "pipeline": "apache_logs" }} { … document body … } … …
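The bulk form pairs each document with an action line that carries the pipeline name. A minimal Python sketch of assembling such a body follows; the helper name `build_bulk_body` and the sample documents are assumptions, and the placement of `pipeline` on the action line simply follows the slide. Nothing is sent over the network here.

```python
import json


def build_bulk_body(documents, pipeline):
    """Serialise documents as newline-delimited JSON for a _bulk request.

    Each document is preceded by an action line; the pipeline name rides
    on the action line, as on the slide.
    """
    lines = []
    for document in documents:
        lines.append(json.dumps({"index": {"pipeline": pipeline}}))
        lines.append(json.dumps(document))
    # The bulk format expects a newline after the last line as well.
    return "\n".join(lines) + "\n"


# Illustrative documents, not taken from the talk.
bulk_body = build_bulk_body(
    [{"message": "GET /index.html"}, {"message": "GET /about.html"}],
    pipeline="apache_logs",
)
```

The resulting string would be the request body for `POST logs/apache/_bulk`; the single-document form on the slide passes the same pipeline name as a query-string parameter instead.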
  40. Painless Scripting: Dynamic/Static
    def first = input.doc.first_name.0; def last = input.doc.last_name.0; return first + " " + last;
  41. Painless Scripting: Dynamic/Static
    String first = (String)((List)((Map)input.get("doc")).get("first_name")).get(0); String last = (String)((List)((Map)input.get("doc")).get("last_name")).get(0); return first + " " + last;
  42. Painless Scripting: Dynamic/Static
    String first = (String)((List)((Map)input.get("doc")).get("first_name")).get(0); String last = (String)((List)((Map)input.get("doc")).get("last_name")).get(0); return first + " " + last;
    10x faster!
  43. Data structures
  44. Data structures: String Mappings
    { "type": "string", "index": "not_analyzed" }
  45. Data structures: String Mappings
    { "type": "text", "index": true } { "type": "keyword", "index": true }
  46. Data structures: String Mappings
    { "type": "text" }: Full text field, Full analysis chain, Full text relevance
  47. Data structures: String Mappings
    { "type": "keyword" }: Concrete string value, Limited keyword analysis, Doc values
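The split of `string` into `text` and `keyword` can be expressed as a small translation of old field mappings. The Python sketch below is a hypothetical helper and not an Elasticsearch utility: `convert_string_mapping` only inspects the `type` and `index` keys and handles just the two cases the slides show, analyzed strings becoming `text` and `not_analyzed` strings becoming `keyword`.

```python
def convert_string_mapping(mapping):
    """Translate a legacy {"type": "string"} field mapping to text or keyword.

    Hypothetical helper: it only looks at the "type" and "index" keys.
    """
    if mapping.get("type") != "string":
        return dict(mapping)  # nothing to convert
    legacy_index = mapping.get("index", "analyzed")
    if legacy_index == "analyzed":
        return {"type": "text", "index": True}
    if legacy_index == "not_analyzed":
        return {"type": "keyword", "index": True}
    raise ValueError("unhandled legacy index setting: %r" % (legacy_index,))


old_field = {"type": "string", "index": "not_analyzed"}
new_field = convert_string_mapping(old_field)
```

A real migration would also need to carry over analyzers, multi-fields, and other settings, which this sketch ignores.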
  48. Data structures: Point field encoding
  49. Data structures: Point field encoding
    k-dimensional tree; Replace encoding for Numeric/Date/IPv4
  50. NumericField vs PointField
    Bar chart (0% to 100%) comparing Index Size, Index Time, Search Time, and Heap Usage; NumericField is shown at 100% for each metric, and the PointField labels read 15%, 76%, 49%, 49%.
  51. Data structures: Point field encoding
    k-dimensional tree; Allows support for • BigInteger • BigDecimal • IPv6
  52. Data structures: Point field encoding
    k-dimensional tree
  53. Data structures: Point field encoding
    k-dimensional tree where 1 <= k <= 8
  54. Data structures: Point field encoding
    k-dimensional tree; k=2: Lat/Lon; k=3: 3D Geo Points; k=?: ….
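For readers unfamiliar with k-dimensional trees, the following Python sketch builds a small in-memory k-d tree and answers a box (range) query, here with k=2. It is a textbook version for illustration only: the names `Node`, `build_kd_tree`, and `range_search` and the sample points are assumptions, and Lucene's on-disk point encoding is considerably more involved than this.

```python
from collections import namedtuple

Node = namedtuple("Node", ["point", "axis", "left", "right"])


def build_kd_tree(points, depth=0):
    """Recursively split points on the median, cycling through the k axes."""
    if not points:
        return None
    k = len(points[0])
    axis = depth % k
    ordered = sorted(points, key=lambda p: p[axis])
    mid = len(ordered) // 2
    return Node(
        point=ordered[mid],
        axis=axis,
        left=build_kd_tree(ordered[:mid], depth + 1),
        right=build_kd_tree(ordered[mid + 1:], depth + 1),
    )


def range_search(node, lower, upper):
    """Return all points p with lower[i] <= p[i] <= upper[i] on every axis."""
    if node is None:
        return []
    found = []
    if all(lo <= v <= hi for lo, v, hi in zip(lower, node.point, upper)):
        found.append(node.point)
    # Only descend into a side if the query box can overlap it.
    if lower[node.axis] <= node.point[node.axis]:
        found.extend(range_search(node.left, lower, upper))
    if upper[node.axis] >= node.point[node.axis]:
        found.extend(range_search(node.right, lower, upper))
    return found


# Illustrative 2-D points (k=2); not data from the talk.
points = [(2, 3), (5, 4), (9, 6), (4, 7), (8, 1), (7, 2)]
tree = build_kd_tree(points)
matches = range_search(tree, lower=(3, 1), upper=(8, 5))
```

The same two functions work unchanged for any k, since the splitting axis is derived from the length of the point tuples.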
  55. Data structures: Geo Point Fields vs Geo Shape Fields
  56. Data structures: Geo Fields
  57. Data structures: Geo Fields
    Experimental! Share encoding, Combined functionality
  58. Search: Completion suggester
    Document oriented, Respects deletes, Boost by: • prefix length • context • geolocation
  59. Search: Search After
    "size": 10, "sort": [ { "age": "asc" }, { "_uid": "desc" } ], "search_after": [ 18, "tweet#654323" ]
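The request fragment above resumes a sorted result set from the sort values of the last hit on the previous page. The Python sketch below imitates that behaviour over an in-memory list, using the same sort (age ascending, `_uid` descending as tiebreaker). The function `search_page` and the sample documents are hypothetical and stand in for the server side; this is not client code.

```python
def search_page(docs, size, search_after=None):
    """Return one page sorted by age ascending, then _uid descending.

    search_after holds the [age, _uid] sort values of the last hit on the
    previous page, or None for the first page.
    """
    # Two stable sorts: secondary key first (descending), then primary (ascending).
    ordered = sorted(docs, key=lambda d: d["_uid"], reverse=True)
    ordered = sorted(ordered, key=lambda d: d["age"])

    def comes_after(doc):
        if search_after is None:
            return True
        last_age, last_uid = search_after
        if doc["age"] != last_age:
            return doc["age"] > last_age
        return doc["_uid"] < last_uid  # descending tiebreaker

    hits = [d for d in ordered if comes_after(d)][:size]
    next_cursor = [hits[-1]["age"], hits[-1]["_uid"]] if hits else None
    return hits, next_cursor


# Illustrative documents; the values are made up for this example.
docs = [
    {"_uid": "tweet#1", "age": 18},
    {"_uid": "tweet#2", "age": 18},
    {"_uid": "tweet#3", "age": 21},
    {"_uid": "tweet#4", "age": 17},
    {"_uid": "tweet#5", "age": 30},
]

page1, cursor1 = search_page(docs, size=2)
page2, cursor2 = search_page(docs, size=2, search_after=cursor1)
```

The unique `_uid` tiebreaker matters: without it, two documents with the same age could be skipped or repeated across pages.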
  60. Java HTTP Client
    • Decouple server/client • core becomes server • Single system boundary
  61. Java HTTP Client
    • Aligns clients across languages • Eating our own dog food • Minimal dependencies
  62. Except where otherwise noted, this work is licensed under http://creativecommons.org/licenses/by-nd/4.0/
    Creative Commons and the double C in a circle are registered trademarks of Creative Commons in the United States and other countries. Third party marks and brands are the property of their respective holders.