Slide 1

Slide 1 text

elasticsearch and Tire

Slide 2

Slide 2 text

● Travis Douce

Slide 3

Slide 3 text

Elasticsearch Tire

Slide 4

Slide 4 text

What will we be talking about?
● Do you need elasticsearch?

Slide 5

Slide 5 text

What will we be talking about?
● Do you need elasticsearch?
● What is elasticsearch?

Slide 6

Slide 6 text

What will we be talking about?
● Do you need elasticsearch?
● What is elasticsearch?
● What I needed to understand to be productive with Elasticsearch and Tire
  – Document oriented
  – RESTful
  – Searching basics (basic queries, facets)

Slide 7

Slide 7 text

Do I need elasticsearch? http://diabetesadvocacycom.blogspot.com/2012/03/i-dont-know.html

Slide 8

Slide 8 text

● All books with “Apple” in the title
● All book authors who have written books with “Apple” in the title
● Searching text for words that sound like “season”
● Auto-completing a search box based on previously issued searches while accounting for misspellings
http://exploringelasticsearch.com/book/an-overview/what-is-elasticsearch.html

Slide 9

Slide 9 text

Elasticsearch is not the right tool for problems that relational databases are optimized for:
– Calculating how many items are left in the inventory
– Figuring out the sum of all line-items on all the invoices sent out in a given month
– Executing two operations transactionally with rollback support
– Creating records that are guaranteed to be unique across multiple given terms, for instance a phone number and extension
http://exploringelasticsearch.com/book/an-overview/what-is-elasticsearch.html

Slide 10

Slide 10 text

Who indexes anyway?

Slide 11

Slide 11 text

What is Elasticsearch?
● Tool for querying written words
● Standalone database server, written in Java, that takes data in and stores it in a sophisticated format optimized for language-based searches
● Main protocol is implemented with HTTP/JSON

Slide 12

Slide 12 text

Lucene Elasticsearch

Slide 13

Slide 13 text

What I needed to understand to be productive
● Document oriented
● Real time analytics
● Distributed
● High availability
● RESTful API
● Multi-tenancy
● Full text search
● Conflict management
● Schema free
● Real time data
● Per-operation persistence
● Apache 2 open source license
● Built on top of Apache Lucene™
http://www.elasticsearch.org/overview/

Slide 14

Slide 14 text

What I needed to understand to be productive
● Document oriented
● RESTful API
● Full text search
● Schema free
● Real time data

Slide 17

Slide 17 text

1. Document Oriented JSON : The Language of Elasticsearch

Slide 18

Slide 18 text

Analogy: Elasticsearch to RDBMS
● Index : database
● Type : table
● Type mapping : schema
● Document : table row

Slide 19

Slide 19 text

Documents

Slide 20

Slide 20 text

Documents id

Slide 21

Slide 21 text

● Document : like a row in an RDBMS
● id : like a primary key in an RDBMS
http://wac.450f.edgecastcdn.net/80450F/k2radio.com/files/2012/11/food-for-thought-630x596.jpg

Slide 22

Slide 22 text

Field Field

Slide 24

Slide 24 text

Field
● Smallest individual unit of data
● Has a defined type and has one or many values of that type
● Contains a single piece of data:
  – A number like 42
  – A string like "Hello, World!"
  – An array like [5, 6, 7, 8]
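To make the field idea concrete, here is a minimal sketch of a document as a plain Ruby hash serialized to JSON. The field names (`answer`, `greeting`, `ratings`) are made up for illustration; only the three values come from the slide.

```ruby
require "json"

# Hypothetical document: the field names are invented, the values are the
# three examples from the slide (a number, a string, an array).
document = {
  "answer"   => 42,
  "greeting" => "Hello, World!",
  "ratings"  => [5, 6, 7, 8]
}

# Each field carries exactly one kind of value.
field_types = document.transform_values { |value| value.class.name }

puts JSON.pretty_generate(document)
puts field_types.inspect
```

The JSON printed here is the shape of what would be sent to Elasticsearch as a document body.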

Slide 25

Slide 25 text

Type
● List of fields that can be specified for documents of that type
● Each document has a type mapping
  – Either user-defined, or
  – Inferred
● Defines the types of its fields (integer, string, etc.)
http://exploringelasticsearch.com/

Slide 26

Slide 26 text

Type : analogous to a table in an RDBMS
http://wac.450f.edgecastcdn.net/80450F/k2radio.com/files/2012/11/food-for-thought-630x596.jpg

Slide 27

Slide 27 text

Data Types

Slide 28

Slide 28 text

Example Type Mapping
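The mapping shown on the original slide did not survive extraction. As a stand-in, here is a sketch of what a type mapping for the movie example could look like, written as a Ruby hash and printed as JSON. The type name `movie` and the exact field list are assumptions pieced together from later slides (title, description, actor name, genre), and the syntax is the string-based mapping of the Elasticsearch versions that Tire targeted.

```ruby
require "json"

# Assumed mapping for a hypothetical "movie" type. Field names are guesses
# based on the search goals listed later in the deck.
mapping = {
  "movie" => {
    "properties" => {
      "title"       => { "type" => "string", "analyzer" => "snowball" },
      "description" => { "type" => "string", "analyzer" => "snowball" },
      "actors"      => { "type" => "string" },
      # Facet fields are stored verbatim, without analysis.
      "genre"       => { "type" => "string", "index" => "not_analyzed" }
    }
  }
}

puts JSON.pretty_generate(mapping)
```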

Slide 33

Slide 33 text

Index
● The largest single unit of data is an index
● Documents are unique per index
http://exploringelasticsearch.com/

Slide 34

Slide 34 text

Index Type Type

Slide 35

Slide 35 text

Indexes : analogous to a database in traditional RDBMS http://wac.450f.edgecastcdn.net/80450F/k2radio.com/files/2012/11/food-for-thought-630x596.jpg

Slide 36

Slide 36 text

RESTful
● The API tends to map HTTP verbs onto the Create, Read, Update, and Delete operations that are fundamental to most databases
http://exploringelasticsearch.com/

Slide 37

Slide 37 text

RESTful examples
● Create index
● Create document
● Retrieve document
● Update document
● Delete document
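The request examples from this slide were images and are lost. The sketch below lists one plausible verb and path for each of the five operations, following the `/index/type/id` URL layout of the Elasticsearch versions of that era. The index name `movie_db` and host `localhost:9200` appear elsewhere in the deck; the type name `movie` and the id are assumptions, and nothing is actually sent over the network.

```ruby
BASE_URL = "http://localhost:9200"

# Returns [label, HTTP verb, path] triples for the five operations.
# The type name and id passed in are illustrative only.
def rest_examples(index, type, id)
  doc_path = "/#{index}/#{type}/#{id}"
  [
    ["Create index",      "PUT",    "/#{index}"],
    ["Create document",   "PUT",    doc_path],
    ["Retrieve document", "GET",    doc_path],
    ["Update document",   "POST",   "#{doc_path}/_update"],
    ["Delete document",   "DELETE", doc_path]
  ]
end

rest_examples("movie_db", "movie", 1).each do |label, verb, path|
  puts format("%-18s %-6s %s%s", label, verb, BASE_URL, path)
end
```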

Slide 38

Slide 38 text

Movie Web Application
● Want to be able to search movies by
  – Movie description
  – Movie name
  – Actor name
● Sort results in a direction
● Sort results by an attribute
● Facet on genre
  – Aggregate counts of distinct genres within the result-set

Slide 39

Slide 39 text

Database schema
● Database tables
  – Movies table
    ● name : “string”
    ● description : “string”
  – Genres table
    ● name : “string”
  – Actors table
    ● first_name : “string”
    ● last_name : “string”

Slide 40

Slide 40 text

[Diagram of table relationships; only the cardinality labels “1” and “many” survive]

Slide 41

Slide 41 text

[Diagram of table relationships; only the cardinality labels “1” and “many” survive]

Slide 42

Slide 42 text

Elasticsearch : Search Basics Create Index

Slide 43

Slide 43 text

Search Basics : Integrating Tire
● Domain Specific Language (DSL) for Elasticsearch
● ActiveModel integration in Rails applications
● At this point, Elasticsearch and Tire would just work
● No type mapping provided
● Elasticsearch infers the type mapping
● But... we would only be able to search attributes on the Movie model/table
Code callouts: Include Tire; Index name

Slide 44

Slide 44 text

elasticsearch : Search Basics ● Create 4 Documents

Slide 45

Slide 45 text

Tire : Search Basics
Create Elasticsearch documents. Now, we need a custom type mapping.
● When a record is created in the database, create a document in the index
● The indexed document represents a denormalized table
  – Do not perform joins in the index
Code callout: Tire method
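Denormalizing means flattening the movie, its genre, and its actors into one self-contained document before indexing. Here is a plain-Ruby sketch of that step; the `Struct` models stand in for the Rails models, and the assumption that a movie has one genre and many actors is a guess, since the relationship diagram is not recoverable. In a Tire model, this flattening is usually done by overriding `to_indexed_json`, which may be the "Tire method" the slide points at.

```ruby
require "json"

# Stand-ins for the database tables from the schema slide.
Genre = Struct.new(:name)
Actor = Struct.new(:first_name, :last_name)
Movie = Struct.new(:id, :name, :description, :genre, :actors)

# Flatten one movie and its related rows into a single document,
# so no join is needed at search time.
def to_search_document(movie)
  {
    "name"        => movie.name,
    "description" => movie.description,
    "genre"       => movie.genre.name,
    "actors"      => movie.actors.map { |a| "#{a.first_name} #{a.last_name}" }
  }
end

# Made-up sample data.
movie = Movie.new(1, "Blade Runners", "A rollerblading story",
                  Genre.new("Science Fiction"),
                  [Actor.new("Jane", "Doe"), Actor.new("John", "Roe")])

doc = to_search_document(movie)
puts JSON.pretty_generate(doc)
```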

Slide 46

Slide 46 text

Tire : Search Basics
Stay in sync
● When records are created, updated, or deleted in the database, do the same to the document in the index
● Real time

Slide 47

Slide 47 text

elasticsearch : Search Basics
● All elasticsearch queries boil down to the task of restricting the result set, scoring, then sorting
● Searches are handled by the Search API. This API has several other APIs nested inside of it:
  – Query DSL
  – Filter API
  – Facet API
  – Sort API

Slide 48

Slide 48 text

elasticsearch : Search Basics Custom type mapping ● Search by ● Title

Slide 49

Slide 49 text

elasticsearch : Search Basics Custom type mapping ● Search by ● Title ● Actor name

Slide 50

Slide 50 text

elasticsearch : Search Basics Custom type mapping ● Search by ● Title ● Actor name ● Facet on genre ("index" : "not_analyzed")

Slide 51

Slide 51 text

elasticsearch : Search Basics
Custom type mapping
● Search by
  – Title
  – Actor name
  – Description
● Facet on genre ("index" : "not_analyzed")

Slide 52

Slide 52 text

Tire : Search Basics Custom Type Mapping

Slide 53

Slide 53 text

elasticsearch : String Query
Code callouts: Index; Search endpoint; HTTP verb; Query the description field for “rollerblading”
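The request on this slide was an image. A rough reconstruction of its parts (HTTP verb, index, search endpoint, query body) is sketched below as data rather than as a live call. It assumes the `movie_db` index used elsewhere in the deck and uses a `query_string` query with `default_field` set to `description`; the slide may have used a different query form.

```ruby
require "json"

# Describe a search request without sending it.
# "description" as the searched field comes from the slide; the helper
# name and the query form are illustrative.
def description_search(index, text)
  {
    verb: "POST",
    path: "/#{index}/_search",
    body: {
      "query" => {
        "query_string" => { "default_field" => "description", "query" => text }
      }
    }
  }
end

request = description_search("movie_db", "rollerblading")
puts "#{request[:verb]} #{request[:path]}"
puts JSON.pretty_generate(request[:body])
```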

Slide 54

Slide 54 text

Tire : String Query

Slide 55

Slide 55 text

Tire : String Query Make search dynamic: query_string → 'rollerblading'

Slide 56

Slide 56 text

elasticsearch : String Query

Slide 57

Slide 57 text

elasticsearch : String Query ….but what ???? How did elasticsearch understand that 'rollerblading' and 'rollerblades' were related words?

Slide 58

Slide 58 text

elasticsearch : String Query ….but what ???? How did elasticsearch understand that 'rollerblading' and 'rollerblades' were related words? 'snowball analyzer' in the type mapping

Slide 59

Slide 59 text

elasticsearch : String Query Remember, the analyzer defined in the custom 'type mapping'!

Slide 60

Slide 60 text

elasticsearch : Snowball Analyzer
“rollerblading”, “rollerblader”, and “rollerbladed” have each been cut down to the same stem, “rollerblad”
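The effect of stemming can be imitated with a deliberately crude suffix stripper. This is a toy, not the Snowball algorithm: the suffix list is hand-picked so that the words from the slides collapse to one stem, and it would mangle plenty of ordinary English words.

```ruby
# Toy stemmer: strips the first matching suffix from a fixed list.
# NOT the real Snowball analyzer; for illustration only.
SUFFIXES = %w[ing er ed es].freeze

def toy_stem(word)
  suffix = SUFFIXES.find { |s| word.end_with?(s) }
  suffix ? word[0...-suffix.length] : word
end

words = %w[rollerblading rollerblader rollerbladed rollerblades]
stems = words.map { |w| toy_stem(w) }
puts stems.uniq.inspect
```

Because both the indexed text and the query text pass through the same analyzer, a search for one form matches documents that contain any of the others.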

Slide 61

Slide 61 text

elasticsearch : Scoring Scores the documents based on similarity to the query ● The similarity’s value is usually known as the document’s score

Slide 63

Slide 63 text

elasticsearch : Results It's JSON !!!!!

Slide 64

Slide 64 text

elasticsearch : Results It's JSON !!!!! score

Slide 65

Slide 65 text

elasticsearch : Results It's JSON !!!!! score Number of hits

Slide 66

Slide 66 text

elasticsearch : Results
It's JSON !!!!!
Callouts: score; number of hits; entire document returned
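Since the response is plain JSON, the three things called out on the slide can be read with the standard library alone. The response below is a hand-written sample in the general shape of a search response; its values, and the movie it contains, are invented.

```ruby
require "json"

# Hand-written sample response; every value here is made up.
raw = <<~RESPONSE
  {
    "hits": {
      "total": 1,
      "max_score": 0.5,
      "hits": [
        {
          "_index": "movie_db",
          "_id": "1",
          "_score": 0.5,
          "_source": { "name": "Blade Runners", "description": "A rollerblading story" }
        }
      ]
    }
  }
RESPONSE

response = JSON.parse(raw)
total    = response["hits"]["total"]         # number of hits
first    = response["hits"]["hits"].first
score    = first["_score"]                   # similarity score
document = first["_source"]                  # entire document returned

puts "#{total} hit(s), top score #{score}: #{document['name']}"
```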

Slide 67

Slide 67 text

elasticsearch : Query Options
Other parameters and options, such as:
● Maximum result set size
● Result offset location
● Search phrases
● “Fuzzy” querying (“skateboarig” matches “skateboarding”)
● 36 different query types, and 25 different filter types

Slide 68

Slide 68 text

Tire : Query Options

Slide 69

Slide 69 text

Tire : Dynamic Query Options
Make search dynamic
● Sort results: sort direction, sort on
● Result set size
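Making the search dynamic amounts to building the request body from user-supplied parameters. A sketch of that idea without Tire, producing the raw JSON body: the helper name, its defaults, and the `name` sort field are assumptions, while `sort`, `size`, and `from` are the standard search body keys for ordering, result set size, and offset.

```ruby
require "json"

# Build a search body from request parameters (hypothetical helper).
def build_search(text, sort_by: "name", direction: "asc", size: 10, from: 0)
  unless %w[asc desc].include?(direction)
    raise ArgumentError, "direction must be asc or desc"
  end

  {
    "query" => { "query_string" => { "query" => text } },
    "sort"  => [{ sort_by => { "order" => direction } }],
    "size"  => size,   # maximum result set size
    "from"  => from    # result offset
  }
end

body = build_search("rollerblading", direction: "desc", size: 5)
puts JSON.pretty_generate(body)
```

Validating the direction up front keeps arbitrary user input out of the request body.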

Slide 70

Slide 70 text

3.4 Facets

Slide 71

Slide 71 text

elasticsearch : Facets ● Aggregate statistics for query results ● Example: – Consider a user searching for movies by title – Provide aggregate counts of distinct genres within the result-set.

Slide 72

Slide 72 text

elasticsearch : Facets ● Disable analysis with "index": "not_analyzed". Otherwise: ● "Romance" would be transformed to "romanc" ● "Science Fiction" would be aggregated as two separate categories “science” and “fiction”.
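The difference can be simulated in a few lines. The sketch below counts genre values twice: once verbatim, as a not-analyzed field would be, and once after a crude lowercase-and-split step standing in for analysis. It does no stemming, so "Romance" becomes "romance" here rather than "romanc"; the sample genres are invented.

```ruby
# Count occurrences of each term, like a terms facet would.
def terms_facet(values)
  values.each_with_object(Hash.new(0)) { |value, counts| counts[value] += 1 }
end

genres = ["Romance", "Science Fiction", "Science Fiction"]

# Stored verbatim: one bucket per genre.
not_analyzed = terms_facet(genres)

# Crude stand-in for analysis: lowercase and split on whitespace.
analyzed = terms_facet(genres.flat_map { |g| g.downcase.split })

puts not_analyzed.inspect
puts analyzed.inspect
```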

Slide 73

Slide 73 text

elasticsearch : Facets ● Searches for movies with a description that contains the word “rollerblading”

Slide 74

Slide 74 text

elasticsearch : Facets
● Results contain another key, 'facets'
● 3 facets

Slide 75

Slide 75 text

Tire : Facets Facet name Facet on

Slide 76

Slide 76 text

Other popular functionality
● Filters
● Highlighting
● Pagination
● Search suggest
● Many more

Slide 77

Slide 77 text

Goals : Movie Web Application
● Want to be able to search movies by
  – Movie description
  – Movie name
  – Actor name
● Sort results in a direction
● Sort results by an attribute
● Facet on genre
  – Aggregate counts of distinct genres within the result-set

Slide 78

Slide 78 text

Goals Achieved! Search by Actor name Movie description Movie name Sort on Sort by Facet on genre

Slide 79

Slide 79 text

Helpful bits of technology ● https://github.com/mobz/elasticsearch-head ● https://httpkit.com/resources/HTTP-from-the-Comm

Slide 80

Slide 80 text

Similar Technologies ● Solr ● Sphinx

Slide 81

Slide 81 text

Resources
● https://httpkit.com/resources/HTTP-from-the-Command-Line/#make-a-request
● http://exploringelasticsearch.com/book/an-overview/what-is-elasticsearch.html
● https://github.com/karmi/retire

Slide 82

Slide 82 text

● curl -X POST -d '{"query": {"match": {"_all": "story"}}}' 'http://localhost:9200/movie_db/_search?pretty=true'
● curl -X POST -d '{"query": {"match": {"description": "hacking"}}, "facets": {"genre": {"terms": {"field": "genre"}}}}' 'http://localhost:9200/movie_db/_search?pretty=true'

Slide 83

Slide 83 text

Elasticsearch Searching Steps
● The first step is matching all documents that meet the given criteria
● The second step is scoring the documents based on similarity to the query
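The two steps, plus the final sort, can be mimicked on an in-memory collection. This is a toy: the score is a raw count of how often the term appears, which is far simpler than the relevance scoring Elasticsearch actually performs, and the sample documents are invented.

```ruby
# Made-up documents keyed by id.
DOCS = {
  1 => "a story about hacking",
  2 => "hacking hacking and more hacking",
  3 => "a love story"
}.freeze

# Step 1: match documents containing the term.
# Step 2: score each match (toy score: raw term count).
# Then sort by score, highest first.
def toy_search(docs, term)
  docs.map { |id, text| [id, text.downcase.split.count(term)] }
      .select { |_id, score| score > 0 }
      .sort_by { |_id, score| -score }
end

toy_search(DOCS, "hacking").each do |id, score|
  puts "doc #{id} score #{score}"
end
```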