An Introduction to Elasticsearch

This is a presentation I created while researching Elasticsearch. We have now switched completely to Elasticsearch and really enjoy it. The deck is heavily influenced by Alex Brasetvik's presentations. He also sent me most of the slides I used.

Amir Sedighi

November 20, 2014

Transcript

  1. • Thanks to Alex Brasetvik (@alexbrasetvik) from @foundsays for the slides. • Thanks to Leslie Hawthorn (@lhawthorn) from @elasticsearch for the stickers.
  2. Powered by Lucene, Search Stuff
    • 1999 Doug Cutting • 2003 Doug Cutting • 2004 Yonik Seeley • 2010 Shay Banon
  3. • Full-Text Search Library • Free & Open-Source • Features: – Indexes & Analyzes Data – Tokenizing – Filtering – Wildcards – Aggregation – Sorting
  4. • Free and Open-Source • Java (Cross-platform) • Real-Time Analytical Search Engine • Distributed • Highly Available • RESTful
  7. Memory
    • Search engines have a great appetite for memory! – Caches, caches, caches • Field and filter caches • Index building
  8. Comparison
    • RDBMSs are built to store. They keep useful data in memory and flush to disk when memory runs out. – Slower, but still working. – Timeouts are the client's concern. • Search engines are built for speed. – Fast running or not running at all. – Assumption: you have provided enough memory.
  9. Out Of Memory
    • In the best case: – Your indexing or search request simply fails. • In worse cases: – The cluster state gets corrupted. – Netty crashes. • Just don't end up there in your production cluster.
  10. Warning Signs
    • ES provides lots of endpoints that give you insight into its state. – Resource Usage • Cache Sizes • Heap Space • There are monitoring tools. – Profile your queries and optimize them.
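
As a rough sketch of watching heap space through those endpoints, the snippet below reads the node stats API (`GET /_nodes/stats/jvm`) and flags nodes whose `jvm.mem.heap_used_percent` is high. The base URL, the 75% threshold, and the function names are assumptions for illustration, not values from the slides.

```python
import json
import urllib.request


def fetch_node_stats(base_url="http://localhost:9200"):
    """Fetch JVM stats for all nodes. The base URL is an assumed local node."""
    with urllib.request.urlopen(base_url + "/_nodes/stats/jvm") as resp:
        return json.load(resp)


def nodes_over_heap_threshold(node_stats, threshold_percent=75):
    """Return (node name, heap_used_percent) pairs at or above the threshold.

    The 75% default is an illustrative assumption, not an official limit.
    Results are sorted with the most loaded node first.
    """
    flagged = []
    for node_id, node in node_stats.get("nodes", {}).items():
        heap = node.get("jvm", {}).get("mem", {}).get("heap_used_percent")
        if heap is not None and heap >= threshold_percent:
            flagged.append((node.get("name", node_id), heap))
    return sorted(flagged, key=lambda item: item[1], reverse=True)
```

Against a live cluster this could be called as `nodes_over_heap_threshold(fetch_node_stats())`. The parsing function also works on a saved response, which makes it easy to try without a running node.
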
  11. Memory Constraints
    • Large heaps are expensive to garbage collect. – The JVM can no longer use pointer compression if the heap goes beyond 32GB. – Keep heap < 32GB • Single machine with a huge amount of memory/SSD. – Run multiple nodes on a super-fast machine with SSDs and a large amount of RAM. (Note: replicas, single point of failure) • Scale out
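
The arithmetic behind "keep heap < 32GB" and "scale out" can be sketched as below. The 31 GB cap (a safety margin under the 32 GB limit) and the rule of giving the heap half of the machine's RAM are assumptions based on common guidance, not figures from the slides. The function names are hypothetical.

```python
import math


def suggested_heap_gb(machine_ram_gb, heap_cap_gb=31, ram_fraction=0.5):
    """Suggest a heap size that stays under the compressed-pointer limit.

    Assumptions: the heap gets `ram_fraction` of RAM (the rest is left to the
    OS file cache) and is capped at `heap_cap_gb`, just below 32 GB.
    """
    return min(int(machine_ram_gb * ram_fraction), heap_cap_gb)


def nodes_for_total_heap(total_heap_gb, heap_per_node_gb=31):
    """Number of nodes needed to reach a total heap when each node is capped."""
    return math.ceil(total_heap_gb / heap_per_node_gb)
```

Under these assumptions a 128 GB machine still gets a 31 GB heap per node, so more total heap means more nodes, either several on one large machine or spread across machines.
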
  12. Security
    • Everyone is most welcome. • Auth(z) isn't ES's business. – You are the gatekeeper. • Based on the user's role, limit requests by applying filters. – Running out of memory is a critical issue. (Attacks) – Unfiltered or unnecessary queries consume a lot of memory.
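
One way the gatekeeper idea might look in application code is to wrap every user query in a filter chosen by role before it reaches the cluster. This is a sketch under assumptions: the roles, field names, and `restrict_query` helper are hypothetical, and it uses the `bool` query with a `filter` clause from newer Elasticsearch versions (ES 1.x, current when this deck was written, used the `filtered` query for the same purpose).

```python
# Hypothetical role-to-filter mapping; field names are made up for illustration.
ROLE_FILTERS = {
    "sales": {"term": {"department": "sales"}},
    "support": {"term": {"department": "support"}},
}


def restrict_query(user_query, role):
    """Wrap a user-supplied query so it only sees documents allowed for the role.

    Unknown roles are rejected instead of falling back to an unfiltered query.
    """
    if role not in ROLE_FILTERS:
        raise PermissionError("no filter defined for role: " + repr(role))
    return {
        "query": {
            "bool": {
                "must": [user_query],
                "filter": [ROLE_FILTERS[role]],
            }
        }
    }
```

Rejecting unknown roles outright keeps an unfiltered, memory-hungry query from slipping through by accident.
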
  13. Networking
    • ES works great on a single node. • ES is impressively easy to use for a distributed system. • ES supports lots of different network topologies.
  14. Suggestions
    • Have enough memory to keep your nodes reliable. • Have a majority of nodes. • Favor filters over matching queries. • Keep an eye on the cluster (health). • Don't let users run faceted queries, or reduce their frequency.
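
Assuming "have a majority of nodes" refers to requiring a quorum of master-eligible nodes (in Elasticsearch versions before 7.0 this was configured with `discovery.zen.minimum_master_nodes`), the arithmetic can be sketched as follows. The function names are hypothetical.

```python
def quorum(master_eligible_nodes):
    """Smallest number of nodes that forms a strict majority."""
    if master_eligible_nodes < 1:
        raise ValueError("need at least one master-eligible node")
    return master_eligible_nodes // 2 + 1


def tolerable_failures(master_eligible_nodes):
    """How many master-eligible nodes can be lost while a majority remains."""
    return master_eligible_nodes - quorum(master_eligible_nodes)
```

With three master-eligible nodes the quorum is two, so one node can fail. A fourth node raises the quorum to three without tolerating any additional failure, which is why odd node counts are a common choice.
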