Upgrade to Pro — share decks privately, control downloads, hide ads and more …

Elasticsearch for SQL Users

Elasticsearch for SQL Users

As presented at Great Wide Open 2016

Shaunak Kashyap

March 16, 2016
Tweet

More Decks by Shaunak Kashyap

Other Decks in Technology

Transcript

  1. 2 The Elastic Stack Elasticsearch Store, Index & Analyze Kibana

    User Interface Security Monitoring Alerting Plugins Logstash Beats Ingest Elastic Cloud: Elasticsearch as a Service Hosted Service
  2. 7 CREATE TABLE IF NOT EXISTS emails ( sender VARCHAR(255)

    NOT NULL, recipients TEXT, cc TEXT, bcc TEXT, subject VARCHAR(1024), body MEDIUMTEXT, datetime DATETIME ); CREATE INDEX emails_sender ON emails(sender); CREATE FULLTEXT INDEX emails_subject ON emails(subject); CREATE FULLTEXT INDEX emails_body ON emails(body); curl -XPOST 'http://localhost:9200/enron' -d' { "mappings": { "email": { "properties": { "sender": { "type": "string", "index": "not_analyzed" }, "recipients": { "type": "string", "index": "not_analyzed" }, "cc": { "type": "string", "index": "not_analyzed" }, "bcc": { "type": "string", "index": "not_analyzed" }, "subject": { "type": "string", "analyzer": "english" }, "body": { "type": "string", "analyzer": "english" } } } } Schemas
  3. 9 [LIVE DEMO] • Search for text in a single

    field • Search for text in multiple fields • Search for a phrase https://github.com/ycombinator/es-enron
  4. 10 Other Search Features Stemming Synonyms Did you mean? •

    Jump, jumped, jumping • Queen, monarch • Monetery => Monetary
  5. 12 To analyze or not to analyze? PUT cities/city/1 {

    "city": "Atlanta", "population": 447841 } PUT cities/city/2 { "city": "New Albany", "population": 8829 } PUT cities/city/3 { "city": "New York", "population": 8406000 } POST cities/_search { "query": { "match": { "city": "New Albany" } } } QUERY + = ?
  6. 13 To analyze or not to analyze? PUT cities/city/1 {

    "city": "Atlanta", "population": 447841 } PUT cities/city/2 { "city": "New Albany", "population": 8829 } PUT cities/city/3 { "city": "New York", "population": 8406000 } Term Document IDs Albany 2 New 2,3 Atlanta 1 York 3
  7. 14 To analyze or not to analyze? PUT cities {

    "mappings": { "city": { "properties": { "city": { "type": "string", "index": "not_analyzed" } } } } } MAPPING Term Document IDs New Albany 2 New York 3 Atlanta 1
  8. PUT blog/post/1 { "author_id": 1, "title": "...", "body": "..." }

    PUT blog/post/2 { "author_id": 1, "title": "...", "body": "..." } PUT blog/post/3 { "author_id": 1, "title": "...", "body": "..." } 15 Relationships: Application-side joins PUT blog/author/1 { "name": "John Doe", "bio": "..." } POST blog/author/_search { "query": { "match": { "name": "John" } } } QUERY 1 POST blog/post/_search { "query": { "match": { "author_id": <each id from query 1 result> } } } QUERY 2
  9. PUT blog/post/1 { "author_name": "John Doe", "title": "...", "body": "..."

    } PUT blog/post/2 { "author_name": "John Doe", "title": "...", "body": "..." } 16 Relationships: Data denormalization POST blog/post/_search { "query": { "match": { "author_name": "John" } } } QUERY PUT blog/post/3 { "author_name": "John Doe", "title": "...", "body": "..." }
  10. 17 Relationships: Nested objects PUT blog/author/1 { "name": "John Doe",

    "bio": "...", "blog_posts": [ { "title": "...", "body": "..." }, { "title": "...", "body": "..." }, { "title": "...", "body": "..." } ] } POST blog/author/_search { "query": { "match": { "name": "John" } } } QUERY
  11. 18 Relationships: Parent-child documents PUT blog/author/1 { "name": "John Doe",

    "bio": "..." } POST blog/post/_search { "query": { "has_parent": { "type": "author", "query": { "match": { "name": "John" } } } QUERY PUT blog { "mappings": { "author": {}, "post": { "_parent": { "type": "author" } } } } PUT blog/post/1?parent=1 { "title": "...", "body": "..." } PUT blog/post/2?parent=1 { "title": "...", "body": "..." } PUT blog/post/3?parent=1 { "title": "...", "body": "..." }
  12. 21 Async replication to Elasticsearch 1 2 3 ESSynchronizer flow

    by Yamini Ahluwalia from the Noun Project
  13. 24 Forked writes from application (more robust) 1 2 queue

    by Huu Nguyen from the Noun Project ESSynchronizer 3 4