Elasticsearch & Data Consistency

Drew Raines
September 05, 2013

Washington, DC Elasticsearch Meetup

How does an update-heavy environment leverage Elasticsearch? We talk about what goes well and what doesn't, along with a useful case study.

Transcript

  1. Elasticsearch & Data Consistency
    @drewr
    Thursday, September 5, 13
    * Introduction
    * Local training Oct 21-22, early bird ends Sept 21
    * Why this topic?
      - I come from a mostly read-only archiving use, lots of people ask about update perf...
  2. Search
    * Document-based (JSON)
    * Boolean, fuzzy suggesters, phrase, custom scoring
    * Lucene
  3. Data
    * Keep a copy of the document
    * Don’t have to go to another data store
  4. Fast
    * Distributed and concurrent from the ground up
    * Only locks for some disk writes
  5. Indices are a lightweight abstraction over shards
    This is a single index with two shards
    Add a node...
  6. Nothing gained by moving anything to it (caveat with perf)
    Tell ES you want a copy of the data for availability
  7. P R R P
    Two nodes, four shards
    For search, completely identical
  8. * Writes are always indexing operations
    * Doc comes in with ID, or we generate one, hash it (or routing value), give it a unique shard
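
The routing step on slide 8 can be sketched roughly as follows. This is a minimal illustration, not Elasticsearch's implementation: `pick_shard` is a hypothetical name, and `zlib.crc32` is only a stand-in for whatever stable hash the server actually uses. The point is that the same ID (or routing value) always lands on the same primary shard.

```python
import zlib


def pick_shard(routing_value: str, num_primary_shards: int) -> int:
    """Map a routing value (the doc ID by default) to exactly one primary shard."""
    # Assumption: crc32 stands in for the real hash; any stable hash shows the idea.
    return zlib.crc32(routing_value.encode("utf-8")) % num_primary_shards


# The same ID always maps to the same shard, so later writes and gets can find it.
for doc_id in ["one", "two", "three"]:
    print(doc_id, "-> shard", pick_shard(doc_id, 2))
```

Because the shard number depends on the number of primary shards, changing that number would send existing IDs to different shards in this sketch.
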
  9. * Primary then replicas
    * What happens when a search request comes in?
    * Might choose primary or replica (can tell it primary, but you may not care)
  11. ? ? ? ?
    * Might choose primary or replica (can tell it primary, but you may not care)
    * What if replica doesn’t have the doc yet?
  12. Consistency
    * Eventually consistent (optimistic replication, every copy will eventually see the update)
    * Crazy! I need transactions!
  13. PUT /foo/t/one
    { "age": 35, "bio": "Some long text", "name": "Foo" }
    Let’s index a doc
  14. POST /foo/t/one/_update
    { "doc": { "name": "Bar" } }
    * We replace just that field
    * Save network... maybe we can sneak our update in!
  15. POST /foo/t/one/_update
    { "params": { "ageinc": 1 }, "script": "ctx._source.age += ageinc" }
    * Or a script
    * Same effect...
  16. PUT /foo/t/one
    { "age": 36, "bio": "Some long text", "name": "Bar" }
    Or we can just PUT it again
  17. A write is a write...
    * Index, update, delete.... Same end result!
    * Immutability!
    * Optimistic concurrency control
    * Fast, no locks, no magic, HTTP is stateless anyway (ETags)
  18. { "_id": "one", "_index": "foo", "_type": "t", "_version": 1, "ok": true }
    Look at response again...
  19. PUT /foo/t/one
    { "age": 36, "bio": "Some long text", "name": "Baz" }
    PUT it again
  20. PUT /foo/t/one?version=1
    { "age": 36, "bio": "Some long text", "name": "Baz" }
    PUT it again
  21. PUT /foo/t/one?version=2
    { "age": 36, "bio": "Some long text", "name": "Baz" }
    Version 2 this time...
  22. PUT /foo/t/one?version=3&refresh=true
    { "age": 36, "bio": "Some long text", "name": "Quux" }
    Version 3 this time...
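
The version checks on slides 18-22 can be modeled with a small in-memory sketch. `TinyIndex` and `VersionConflict` are hypothetical names for illustration, not an Elasticsearch client, and the sketch assumes the simple rule that a write carrying `version=N` succeeds only when N is the document's current version. Replication and refresh are left out.

```python
class VersionConflict(Exception):
    """Raised when the caller's expected version is stale."""


class TinyIndex:
    """In-memory stand-in for a single index (hypothetical, single process)."""

    def __init__(self):
        self._docs = {}  # doc_id -> (version, source)

    def put(self, doc_id, source, version=None):
        current_version, _ = self._docs.get(doc_id, (0, None))
        # Like ?version=N: reject the write unless N is the current version.
        if version is not None and version != current_version:
            raise VersionConflict(
                f"current version is {current_version}, request said {version}"
            )
        new_version = current_version + 1
        self._docs[doc_id] = (new_version, dict(source))
        return {"_id": doc_id, "_version": new_version}

    def get(self, doc_id):
        version, source = self._docs[doc_id]
        return {"_id": doc_id, "_version": version, "_source": dict(source)}


index = TinyIndex()
first = index.put("one", {"name": "Foo", "age": 35})    # unconditional write
second = index.put("one", {"name": "Bar", "age": 36})   # unconditional again

try:
    index.put("one", {"name": "Baz", "age": 36}, version=1)  # stale expectation
    stale_rejected = False
except VersionConflict:
    stale_rejected = True

third = index.put("one", {"name": "Baz", "age": 36}, version=2)  # matches current
```

No lock is held between reading a version and writing with it. A writer that loses the race simply gets a conflict and can re-read and retry.
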
  23. Lightweight work queue
    * No new deployment (already have nice distributed data store)
    * Can lose some perf during consume (publish a bunch and walk away, workers busy anyway)
    * Just want a highly available place to store messages
  24. No New Deployment
  25. No New Deployment
    Consume perf loss OK
  26. No New Deployment
    Consume perf loss OK
    Don’t really care about AMQP
  27. Rabbit | Elasticsearch
    Exchange | Index
    Queue | Type
    Routing key | ?
    We don’t care about routing it to any particular place.... just store it and we can find it
  28. news.politics.uk, news.sports, news.politics.usa, news.*, trade.dow, trade.nasdaq
    Routing keys are just queues. Queries can be anything, in this case prefix matching a few types.
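
As a rough illustration of the prefix matching on slide 28, the sketch below filters the type names from the slide by a prefix. `matching_types` is a hypothetical helper. In the real setup this selection would be a prefix query on `_type`, not client-side filtering.

```python
def matching_types(types, prefix):
    """Return the queue (type) names that start with the given prefix."""
    return [name for name in types if name.startswith(prefix)]


types = [
    "news.politics.uk",
    "news.sports",
    "news.politics.usa",
    "trade.dow",
    "trade.nasdaq",
]

news = matching_types(types, "news.")
politics = matching_types(types, "news.politics.")
```
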
  29. Search!
    * Can create different exchanges with different shard configurations and risk profiles
    * Can look for newest message in queue, or ones from Tuesday
    * Can look for ones containing the job referencing cats
  30. POST /exch/test.foo
    { "op": "frob", "thing": "/over/there" }
    Publish...
    Consuming a message is a reindex as unack’ed, maintaining concurrency control
    Get back _id
  31. GET /exch/test.foo/_search
    {
      "query": {
        "bool": {
          "must": [
            { "constant_score": { "filter": { "missing": { "existence": true, "field": "__q_status", "null_value": true } } } },
            { "prefix": { "_type": "test.foo" } }
          ]
        }
      },
      "size": 1,
      "sort": [ { "__q_control": { "order": "desc" } } ]
    }
    Consume...
    Search looks for anything that’s not unacked or acked, just return one
    Get _id
  32. PUT /exch/test.foo/UN1QU3ID?version=2&refresh=true
    { ... "__q_status": "ack" }
    Finish up...
    When we’re done, we ack it... Won’t show up in consumption queries
    *** Remember, this works because when we try to unack (start working), we will get a failure if someone else already has it. Not free to work until successful unack.
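
The publish / consume / ack cycle on slides 30-32 can be sketched in memory as below. All names here (`TinyQueue`, `consume`, the `status` field) are hypothetical stand-ins for the index, the search, and `__q_status`. The sketch ignores sorting, refresh, and replication. It only shows why a versioned "unack" write lets exactly one worker claim a message.

```python
import itertools


class VersionConflict(Exception):
    """Raised when a status write carries a stale version."""


class TinyQueue:
    """In-memory stand-in for the exchange index (hypothetical)."""

    def __init__(self):
        self._docs = {}  # msg_id -> {"version", "status", "body"}
        self._ids = itertools.count(1)

    def publish(self, body):
        msg_id = f"msg-{next(self._ids)}"
        self._docs[msg_id] = {"version": 1, "status": None, "body": body}
        return msg_id

    def search_unclaimed(self):
        # Mirrors the consume query: messages that have no status yet.
        return [(i, d["version"]) for i, d in self._docs.items() if d["status"] is None]

    def set_status(self, msg_id, status, version):
        doc = self._docs[msg_id]
        if doc["version"] != version:
            raise VersionConflict(msg_id)
        doc["status"] = status
        doc["version"] += 1
        return doc["version"]


def consume(queue):
    """Claim one message; return (msg_id, new_version) or None if nothing is free."""
    for msg_id, version in queue.search_unclaimed():
        try:
            return msg_id, queue.set_status(msg_id, "unack", version)
        except VersionConflict:
            continue  # another worker claimed it first; try the next hit
    return None


queue = TinyQueue()
queue.publish({"op": "frob", "thing": "/over/there"})

hit_id, hit_version = queue.search_unclaimed()[0]  # two workers see the same hit
claimed_version = queue.set_status(hit_id, "unack", hit_version)  # worker A wins

try:
    queue.set_status(hit_id, "unack", hit_version)  # worker B is now stale
    second_claim_failed = False
except VersionConflict:
    second_claim_failed = True

queue.set_status(hit_id, "ack", claimed_version)  # worker A finishes the job
```

In this model the claim bumps the message from version 1 to version 2, which is why the final ack is written against version 2. A worker that receives a conflict has not claimed anything and must search again.
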
  33. Performance
    Publishing... ES is a champ
    20k msgs/sec on little 4-core MBA
    Consuming... constant refresh is slow... segment explosion, lots of merging, but still 50-100 msgs/sec
    In the typical work queue scenario, adding a few ms or secs to a job that may run for minutes is acceptable
  34. Conclusion
    * Full consistency is possible in ES, lots of trade-offs users can decide to make
    * Finding data is often more important and nuanced than storing it. Search is king.
    * RDBMS not going anywhere, neither are graph databases. Right tool yada...
    * ?