Getting Started With ElasticSearch

A simple overview of ElasticSearch, the Tire gem, and usage patterns I have discovered.

Ken Collins

March 15, 2013

Transcript

  1. Get Started: ElasticSearch. Ken Collins, March 14th 2013

  5. Today’s Topics • About ElasticSearch • Ruby Libraries • How You Might Use ElasticSearch • Other Uses
  6. About ElasticSearch

  8. About ElasticSearch: http://www.elasticsearch.org

  12. About ElasticSearch

    • Painless Setup & Use
    • Schema Free & Document Oriented
      – Index data using JSON over HTTP.
  13. About ElasticSearch

    $ curl -XPOST http://localhost:9200/twitter/tweet/1 -d '{
        "user": "metaskills",
        "post_date": "2013-03-07T13:12:00",
        "message": "Trying out elasticsearch"
      }'
  17. About ElasticSearch

    • Painless Setup & Use
    • Schema Free & Document Oriented
      – Index data using JSON over HTTP.
      – Schema mappings.
    • Built For The Cloud. Multi-Tenant.
      – Not coupled to a database.
  18. About ElasticSearch

    Index => Database
    Mappings/Document => Table
    Fields/Values => Columns/Row
  21. About ElasticSearch

    • Painless Setup & Use
    • Schema Free & Document Oriented
      – Index data using JSON over HTTP.
      – Schema mappings.
    • Built For The Cloud. Multi-Tenant.
      – Not coupled to a database.
      – Distributed Nodes & Shards
      – Gateway - Time Machine For Search.
  29. About ElasticSearch

    • Great Features
      – Facets
      – Highlighting
      – Geo Location
      – Custom Scripts
    • Open Source
      – Apache 2 License
      – Hosted On GitHub
  30. ElasticSearch Guides

  31. ElasticSearch Guides (Index)

    http://www.elasticsearch.org/guide/reference/api/index_.html
    http://www.elasticsearch.org/guide/reference/api/admin-indices-create-index.html
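
The index API in the guide above comes down to one HTTP request per document. As a minimal sketch, the Ruby below builds that request with the standard library and does not send it. The helper name `build_index_request` and the `base` URL default are assumptions for illustration, not part of ElasticSearch or Tire.

```ruby
require "json"
require "net/http"
require "uri"

# Build (but do not send) the HTTP request that indexes one document.
# With an id we PUT to /index/type/id; without one we POST to /index/type
# so that the server can generate the id.
# NOTE: helper name and `base` default are illustrative assumptions.
def build_index_request(index, type, doc, id: nil, base: "http://localhost:9200")
  path = [index, type, id].compact.join("/")
  uri  = URI.parse("#{base}/#{path}")
  req  = id ? Net::HTTP::Put.new(uri) : Net::HTTP::Post.new(uri)
  req["Content-Type"] = "application/json"
  req.body = JSON.generate(doc)
  req
end

tweet = {
  "user"      => "metaskills",
  "post_date" => "2013-03-07T13:12:00",
  "message"   => "Trying out elasticsearch"
}

request = build_index_request("twitter", "tweet", tweet, id: 1)
puts "#{request.method} #{request.path}"
puts request.body
```

Passing the request to `Net::HTTP.start` would actually send it. Keeping construction separate makes the URL and body easy to inspect first.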

  32. ElasticSearch Guides (Index)

    Index a document under an explicit ID:

    $ curl -XPUT http://localhost:9200/twitter/tweet/1 -d '{
        "user": "metaskills",
        "post_date": "2013-03-07T13:12:00",
        "message": "Trying out elasticsearch"
      }'

    Leave the “1” off the URL (and use POST) for an auto-generated ID.

    Create an index with settings and mappings:

    $ curl -XPOST http://localhost:9200/twitter/ -d '{
        "settings": {"number_of_shards": 10},
        "mappings": {
          "twitter_card": {
            "_source": {"enabled": false},
            "properties": {
              "title": {"type": "string", "index": "not_analyzed"}
            }
          }
        }
      }'
  33. ElasticSearch Guides (Mappings)

    http://www.elasticsearch.org/guide/reference/mapping/
    http://www.elasticsearch.org/guide/reference/api/admin-indices-put-mapping.html

  34. ElasticSearch Guides (Mappings)

    “Mapping is the process of defining how a document should be mapped to the Search Engine, including its searchable characteristics such as which fields are searchable and if or how they are tokenized.”

    $ curl -XPUT http://localhost:9200/twitter/tweet/_mapping -d '{
        "tweet": {
          "properties": {
            "message": {"type": "string", "store": "yes"}
          }
        }
      }'
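
The put-mapping body is plain nested JSON, so it can be assembled from a Ruby hash. The sketch below assumes a hypothetical helper `mapping_body`; the field options are the ones from the slide, and the `string` type reflects the ElasticSearch version of the time.

```ruby
require "json"

# Build the body of a put-mapping request.
# `fields` maps each field name to its mapping options.
# NOTE: `mapping_body` is a hypothetical helper for illustration only.
def mapping_body(type, fields)
  { type => { "properties" => fields } }
end

body = mapping_body("tweet", { "message" => { "type" => "string", "store" => "yes" } })
puts JSON.pretty_generate(body)
```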
  35. ElasticSearch Guides (Analysis)

    http://www.elasticsearch.org/guide/reference/index-modules/analysis/

  36. ElasticSearch Guides (Analysis)

    Input: “The Brown-Cow's Part_No. #A.BC123-456 joe@bloggs.com”

    keyword: The Brown-Cow's Part_No. #A.BC123-456 joe@bloggs.com
    whitespace: The, Brown-Cow's, Part_No., #A.BC123-456, joe@bloggs.com
    simple: the, brown, cow, s, part, no, a, bc, joe, bloggs, com
    standard: brown, cow's, part_no, a.bc123, 456, joe, bloggs.com
    snowball (English): brown, cow, part_no, a.bc123, 456, joe, bloggs.com
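
To build intuition for the three simplest analyzers listed above, here is a rough pure-Ruby approximation. These functions are assumptions written for illustration and are not ElasticSearch's implementations. The standard and snowball analyzers are left out because they involve stop words and stemming.

```ruby
# Rough approximations of three analyzers, for intuition only.

# keyword: the whole input becomes a single token.
def keyword_analyze(text)
  [text]
end

# whitespace: split on whitespace, keep case and punctuation.
def whitespace_analyze(text)
  text.split
end

# simple: keep runs of letters only, lowercased.
def simple_analyze(text)
  text.scan(/[[:alpha:]]+/).map(&:downcase)
end

sample = "The Brown-Cow's Part_No. #A.BC123-456 joe@bloggs.com"
p keyword_analyze(sample)
p whitespace_analyze(sample)
p simple_analyze(sample)
```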
  37. ElasticSearch Guides (Analysis)

    • Analyzers: Standard, Simple, Whitespace, Stop, Keyword, Pattern, Language, Snowball, Custom
    • Tokenizers: Edge NGram, Keyword, Letter, Lowercase, NGram, Standard, Whitespace, Pattern, UAX URL Email, Path Hierarchy
    • Token Filters: Standard, ASCII Folding, Length, Lowercase, NGram, Edge NGram, Porter Stem, Shingle, Stop, Word Delimiter, Stemmer, Stemmer Override, Keyword Marker, KStem, Snowball, Phonetic, Synonym, Compound Word, Reverse, Elision, Truncate, Unique, Pattern Replace, Trim
    • Char Filters: Mapping, HTML Strip
    • Plugin: ICU
  40. ElasticSearch Guides (Search API): http://www.elasticsearch.org/guide/reference/api/search/

  41. ElasticSearch Guides (Query DSL): http://www.elasticsearch.org/guide/reference/query-dsl/

  42. ElasticSearch Guides (Query DSL)

    • Queries: match, multi_match, bool, boosting, ids, custom_score, custom_boost_factor, constant_score, dis_max, field, filtered, flt, flt_field, fuzzy, has_child, has_parent, match_all, mlt, mlt_field, prefix, query_string, range, span_first, span_near, span_not, span_or, span_term, term, terms, top_children, wildcard, nested, custom_filters_score, indices, text, geo_shape
    • Filters: and, bool, exists, ids, limit, type, geo_bbox, geo_distance, geo_distance_range, geo_polygon, geo_shape, has_child, has_parent, match_all, missing, not, numeric_range, or, prefix, query, range, script, term, terms, nested
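
Queries and filters from the lists above compose into one request body. As a sketch, the Ruby below wraps a scored `match` query and an unscored `term` filter in a `filtered` query, one of the query types listed above. The helper name and field names are assumptions chosen to match the tweet example earlier in the deck. Later ElasticSearch versions removed `filtered`, so this reflects the version of the time.

```ruby
require "json"

# Compose a `filtered` query: a full-text `match` narrowed by a `term` filter.
# NOTE: helper name and field names are illustrative assumptions.
def filtered_search(text_field, text, filter_field, value)
  {
    "query" => {
      "filtered" => {
        "query"  => { "match" => { text_field => text } },
        "filter" => { "term"  => { filter_field => value } }
      }
    }
  }
end

puts JSON.pretty_generate(filtered_search("message", "elasticsearch", "user", "metaskills"))
```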
  44. Followup Resources

    Clinton Gormley (@clintongormley)
    • Terms of Endearment: http://www.slideshare.net/clintongormley/terms-of-endearment-the-elasticsearch-query-dsl-explained
    • Cool, Bonsai, Cool: http://www.slideshare.net/clintongormley/cool-bonsai-cool-an-introduction-to-elasticsearch
  45. Ruby Libraries

  49. Ruby Libraries (Tire)

    • DSL To ElasticSearch
    • Declarative Block or Imperative Style
  50. Ruby Libraries (Tire)

    Tire.index 'articles' do
      delete

      create :mappings => {
        :article => {
          :properties => {
            :id      => { :type => 'string', :index => 'not_analyzed' },
            :title   => { :type => 'string', :boost => 2.0 },
            :tags    => { :type => 'string', :analyzer => 'keyword' },
            :content => { :type => 'string', :analyzer => 'snowball' }
          }
        }
      }

      store :title => 'One',   :tags => ['ruby']
      store :title => 'Two',   :tags => ['ruby', 'python']
      store :title => 'Three', :tags => ['java']

      refresh
    end
  51. Ruby Libraries (Tire)

    index = Tire::Index.new('oldskool')
    index.delete
    index.create
    index.store :title => "Let's do it the old way!"
    index.refresh
  53. Ruby Libraries (Tire)

    • DSL To ElasticSearch
    • Declarative Block or Imperative Style
    • ActiveModel Integration
  54. Ruby Libraries (Tire)

    class Article < ActiveRecord::Base
      include Tire::Model::Search
      include Tire::Model::Callbacks

      mapping do
        indexes :id,           :index    => :not_analyzed
        indexes :title,        :analyzer => 'snowball', :boost => 100
        indexes :content,      :analyzer => 'snowball'
        indexes :content_size, :as       => 'content.size'
        indexes :author,       :analyzer => 'keyword'
        indexes :published_on, :type     => 'date', :include_in_all => false
      end

      after_save { update_index if state == 'published' }

      def to_indexed_json
        attributes.slice(...).to_json
      end
    end
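
The `to_indexed_json` hook above controls which attributes reach the index. The sketch below shows the same whitelist idea in plain Ruby, with no ActiveRecord or Tire involved. The class `ArticleDoc` and the key list are hypothetical and do not fill in the keys elided on the slide.

```ruby
require "json"

# Plain-Ruby stand-in for a model, used only to show attribute whitelisting.
class ArticleDoc
  # Hypothetical whitelist of attributes that should be indexed.
  INDEXED_KEYS = %w[id title content author].freeze

  def initialize(attributes)
    @attributes = attributes
  end

  # Serialize only the whitelisted attributes to JSON.
  def to_indexed_json
    @attributes.select { |key, _value| INDEXED_KEYS.include?(key) }.to_json
  end
end

doc = ArticleDoc.new("id" => 1, "title" => "One", "password_digest" => "secret")
puts doc.to_indexed_json
```

Keeping the whitelist explicit means a newly added column does not end up in the search index by accident.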
  57. Ruby Libraries (Tire)

    • DSL To ElasticSearch
    • Declarative Block or Imperative Style
    • ActiveModel Integration
    • Contributed Components Gem
  59. Ruby Libraries (Tire)

    Tire.search 'articles' do
      query { string 'title:T*' }
      filter :terms, :tags => ['ruby']
      sort { by :title, 'desc' }
      facet('global-tags', :global => true) { terms :tags }
      facet('current-tags') { terms :tags }
    end

    {
      "fields": ["name", "shortDescription", "longDescription"],
      "query": {
        "query_string": {
          "fields": ["name"],
          "query": "+camera +laptop",
          "use_dis_max": true
        }
      }
    }
  60. Ruby Libraries (Tire)

    Tire.search({
      fields: ["name", "shortDescription", "longDescription"],
      query: {
        query_string: {
          fields: ["name"],
          query: "+camera +laptop",
          use_dis_max: true
        }
      }
    })
  62. Ruby Libraries (Tire)

    • DSL To ElasticSearch
    • Declarative Block or Imperative Style
    • ActiveModel Integration
    • Contributed Components Gem
    • Tire::Search::Search#to_curl
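
Rendering a search as a curl command is handy for debugging because the command can be pasted straight into a shell. The sketch below is a hypothetical stand-in written with the standard library only. It is not Tire's `to_curl` implementation, and the helper name and `base` URL are assumptions.

```ruby
require "json"

# Hypothetical helper (not Tire's implementation): render a search body as a
# curl command for debugging in a shell.
def search_as_curl(index, body, base: "http://localhost:9200")
  # Escape single quotes so the JSON survives inside a single-quoted shell string.
  json = JSON.generate(body).gsub("'") { "'\\''" }
  "curl -X GET '#{base}/#{index}/_search?pretty' -d '#{json}'"
end

puts search_as_curl("articles", { "query" => { "match_all" => {} } })
```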
  63. Ruby Libraries (Tire)

    https://github.com/karmi/tire
    https://github.com/karmi/tire-contrib/

  65. How You Might Use ElasticSearch: Search Service

  81. Usage Concept (service-ext)

    Account Service search endpoints:
    /search?query=...
    /search/full_name?query=...
    /search/organization?query=...
    /search/email?query=...
    /search/address?query=...

    Index ? Index ? Index ?

    • Cluster • Nodes • Settings • Storage • Shards • Replicas
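
One way the endpoints above could map onto search bodies is sketched here: a field-specific route becomes a `match` query on that field, and the bare `/search` route becomes a `multi_match` across all of them. The function name and the direct route-to-field mapping are assumptions, not something the slides specify.

```ruby
# Fields the hypothetical account search service exposes, one per route.
SEARCH_FIELDS = %w[full_name organization email address].freeze

# Translate a route such as "/search/email" plus a query string into a
# search body: one field -> `match`, bare "/search" -> `multi_match`.
# NOTE: this mapping is an illustrative assumption.
def search_body_for(path, query)
  field = path.sub(%r{\A/search/?}, "")
  if field.empty?
    { "query" => { "multi_match" => { "query" => query, "fields" => SEARCH_FIELDS } } }
  elsif SEARCH_FIELDS.include?(field)
    { "query" => { "match" => { field => query } } }
  else
    raise ArgumentError, "unknown search field: #{field}"
  end
end

p search_body_for("/search/email", "ken@example.com")
p search_body_for("/search", "ken")
```

Rejecting unknown fields keeps callers from querying attributes the service never meant to expose.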
  82. Usage Concept (service-ext): /search/full_name?query=...

  84. Usage Concept (add-on)

    [Diagram] #1 App: /articles/search, /catalog/search. #2 App: /email/search. #1 Service, #2 Service, #3 Service: /search/full_name, /search/address, /search/text, /search/subjects, /search/text, /search/full_name, /search/address.
  85. Usage Concept (add-on)

    [Diagram] #2 App: /email/search. #1 Service, #3 Service: /search/full_name, /search/address, /search/full_name, /search/address.
  87. Usage Concept (lateral-biz-need): Account Search with /full_name, /email, /address

  88. Usage Concept (lateral-biz-need): Account Service with /search/full_name, /search/email, /search/address

  91. Usage Concept (lateral-biz-need)

    [Diagram] Account Search (/full_name, /email, /address); Service #1, App #1, App #2, Service #2; Standard Document Representation
  93. Usage Concept (lateral-biz-need)

    [Diagram] #1 App, #2 App, #1 Service, #2 Service, #3 Service

    $ curl -XGET 'http://localhost:9200/foo,bar/tweet/_search?q=tag:wow'
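
Searching several indices at once only changes the URL: index names are joined with commas. As a sketch, the hypothetical helper below assembles that path with the standard library. The function name is an assumption.

```ruby
require "uri"

# Build the path for a search across one or more indices.
# NOTE: `multi_index_search_path` is a hypothetical helper for illustration.
def multi_index_search_path(indices, type, params)
  "/#{Array(indices).join(',')}/#{type}/_search?#{URI.encode_www_form(params)}"
end

puts multi_index_search_path(%w[foo bar], "tweet", { "q" => "tag:wow" })
```

`URI.encode_www_form` percent-encodes the colon in `tag:wow`. The server decodes it, so the query is the same as the one in the curl example.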
  94. Usage Concept (lateral-biz-need)

    $ curl -XGET http://localhost:9200/test/_msearch --data-binary @requests

    Contents of the requests file (header line, then body line; the blank line is an empty header):

    {"index" : "test"}
    {"query" : {"match_all" : {}}, "from" : 0, "size" : 10}
    {"index" : "test", "search_type" : "count"}
    {"query" : {"match_all" : {}}}
    {}
    {"query" : {"match_all" : {}}}

    {"query" : {"match_all" : {}}}
    {"search_type" : "count"}
    {"query" : {"match_all" : {}}}
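
The multi-search body is newline-delimited JSON: a header line followed by a body line for each search. The sketch below generates that payload from pairs of Ruby hashes. The helper name `msearch_body` is an assumption.

```ruby
require "json"

# Build a newline-delimited multi-search payload from [header, body] pairs.
# Each pair becomes two lines, and the payload ends with a newline.
# NOTE: `msearch_body` is a hypothetical helper for illustration.
def msearch_body(searches)
  searches.map do |header, body|
    "#{JSON.generate(header)}\n#{JSON.generate(body)}\n"
  end.join
end

payload = msearch_body([
  [{ "index" => "test" }, { "query" => { "match_all" => {} }, "from" => 0, "size" => 10 }],
  [{}, { "query" => { "match_all" => {} } }]
])
puts payload
```

Generating the lines from pairs avoids the easy mistake of dropping a header and shifting every later search by one line.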
  95. Usage Concept (lateral-biz-need)

    http://www.elasticsearch.org/guide/reference/api/search/indices-types.html
    http://www.elasticsearch.org/guide/reference/api/multi-search.html
  96. http://search.mycompany.com/ (I’m Feeling Lucky)

  97. ElasticSearch FTW!

    [Diagram] #1 App, #2 App, #1 Service, #2 Service, #3 Service, Search Service #1
  98. Other Usages: Now that we got our hands dirty...

  99. Other Usages (logstash): http://www.logstash.net

  100. Other Usages (logstash): http://www.logstash.net/docs/1.1.9/outputs/elasticsearch

  101. Other Usages (rivers): http://blog.trifork.com/2013/01/10/how-to-write-an-elasticsearch-river-plugin/

  102. Other Usages (big data): http://blog.bugsense.com/post/35580279634/indexing-bigdata-with-elasticsearch

  103. Get Started. Ken Collins, March 14th 2013. Thank You! Go Search