elasticsearch and Tire

Travis Douce

October 01, 2013

Transcript

  1. What will we be talking about?
     • Do you need elasticsearch?
     • What is elasticsearch?
     • What I needed to understand to be productive with Elasticsearch and Tire
       – Document oriented
       – RESTful
       – Searching basics (basic queries, facets)
  2. • All books with “Apple” in the title
     • All book authors who have written books with “Apple” in the title
     • Searching text for words that sound like “season”
     • Auto-completing a search box based on previously issued searches while accounting for misspellings
     http://exploringelasticsearch.com/book/an-overview/what-is-elasticsearch.html
  3. Solving problems for which relational databases are optimized:
     – Calculating how many items are left in the inventory
     – Figuring out the sum of all line-items on all the invoices sent out in a given month
     – Executing two operations transactionally with rollback support
     – Creating records that are guaranteed to be unique across multiple given terms, for instance a phone number and extension
     http://exploringelasticsearch.com/book/an-overview/what-is-elasticsearch.html
  4. What is Elasticsearch?
     • Tool for querying written words
     • Standalone database server, written in Java, that takes data in and stores it in a sophisticated format optimized for language-based searches
     • Main protocol is implemented with HTTP/JSON
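Because the main protocol is HTTP/JSON, any HTTP client can talk to the server. The sketch below only builds a search request and never sends it. The helper name `build_search_request` is hypothetical, and the host, index name, and query are borrowed from the curl examples later in the deck.

```python
import json
from urllib.request import Request


def build_search_request(base_url, index, query):
    """Build (but do not send) an HTTP search request with a JSON body."""
    body = json.dumps({"query": query}).encode("utf-8")
    return Request(
        url=f"{base_url}/{index}/_search",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


# Assumed values: a local server and the movie_db index used later in the deck.
req = build_search_request(
    "http://localhost:9200", "movie_db", {"match": {"description": "hacking"}}
)
print(req.get_method(), req.full_url)
print(req.data.decode("utf-8"))
```

Passing `req` to `urllib.request.urlopen` would actually send it, which requires a running server.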
  5. What I needed to understand to be productive
     http://www.elasticsearch.org/overview/
     • Document oriented
     • Real time analytics
     • Distributed
     • High availability
     • RESTful API
     • Multi-tenancy
     • Full text search
     • Conflict management
     • Schema free
     • Real time data
     • Per-operation persistence
     • Apache 2 open source license
     • Built on top of Apache Lucene™
  6. What I needed to understand to be productive
     • Document oriented
     • RESTful API
     • Full text search
     • Schema free
     • Real time data
  9. Analogy: elasticsearch to RDBMS
     • Index : database
     • Type : table
     • Type mapping : schema
     • Document : table row
  10. • document : like a row in RDBMS
      • id : like a primary key in RDBMS
      http://wac.450f.edgecastcdn.net/80450F/k2radio.com/files/2012/11/food-for-thought-630x596.jpg
  11. Field
      • Smallest individual unit of data
      • Has a defined type and has one or many values of that type
      • Contains a single piece of data:
        – Like the number 42
        – The string "Hello, World!"
        – An array like [5, 6, 7, 8]
  12. Type
      • List of fields that can be specified for documents of that type
      • Each document has a type mapping
        – Either user-defined, or
        – Inferred
      • Defines the types of its fields (integer, string, etc.)
      http://exploringelasticsearch.com/
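To make "inferred" concrete, here is a toy illustration of guessing field types from a document's values. It is not Elasticsearch's actual inference logic; the function name `infer_field_type` and the type labels are assumptions chosen for the example, and the sample values come from the Field slide.

```python
def infer_field_type(value):
    """Toy guess at a field type from a sample value (illustration only)."""
    # bool must be checked before int because bool is a subclass of int in Python.
    if isinstance(value, bool):
        return "boolean"
    if isinstance(value, int):
        return "long"
    if isinstance(value, float):
        return "double"
    if isinstance(value, str):
        return "string"
    if isinstance(value, list) and value:
        # An array field takes the type of its elements.
        return infer_field_type(value[0])
    return "unknown"


# Sample document using the values from the Field slide.
document = {"answer": 42, "greeting": "Hello, World!", "numbers": [5, 6, 7, 8]}
inferred = {field: infer_field_type(value) for field, value in document.items()}
print(inferred)
```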
  13. Index
      • The largest single unit of data is an index
      • Documents are unique per index
      http://exploringelasticsearch.com/
  14. RESTful
      • Tends to match HTTP verbs up to the Create, Read, Update, and Delete operations that are fundamental to most databases
      http://exploringelasticsearch.com/
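One way to picture the verb-to-CRUD correspondence is as a small lookup table. This is a simplified sketch: the names `CRUD_TO_HTTP` and `document_request` are hypothetical, the index and type names are assumed, and "update" is modelled here as re-sending the whole document with PUT.

```python
# Simplified mapping of CRUD operations to HTTP verbs for a single document.
CRUD_TO_HTTP = {
    "create": "PUT",
    "read": "GET",
    "update": "PUT",  # assumption: update by re-indexing the full document
    "delete": "DELETE",
}


def document_request(operation, index, doc_type, doc_id):
    """Return the (verb, path) pair addressing one document."""
    verb = CRUD_TO_HTTP[operation]
    return verb, f"/{index}/{doc_type}/{doc_id}"


for op in ("create", "read", "update", "delete"):
    print(op, document_request(op, "movie_db", "movie", 1))
```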
  15. Movie Web Application
      • Want to be able to search movies by
        – Movie description
        – Movie name
        – Actor name
      • Sort results in a chosen direction
      • Sort results by an attribute
      • Facet on genre
        – Aggregate counts of distinct genres within the result set
  16. Database schema
      • Database tables
        – Movies table
          • name : “string”
          • description : “string”
        – Genres table
          • name : “string”
        – Actors table
          • first_name : “string”
          • last_name : “string”
  17. Search Basics : Integrating Tire
      • Domain Specific Language (DSL) for Elasticsearch
      • ActiveModel integration in Rails applications
      • At this point, Elasticsearch and Tire would just work
        – No type mapping provided
        – Elasticsearch infers the type mapping
      • But... would only be able to search attributes on the Movie model/table
      (Slide annotations: Include Tire; Index name)
  18. Tire : Search Basics
      Create Elasticsearch documents
      • When a record is created in the database, create a document
      • An indexed document represents a denormalized table
        – Do not perform joins in the index
      (Slide annotations: Now, we need custom type mapping; Tire method)
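A rough sketch of what "denormalized" means for the movie schema: related rows are flattened into one document before indexing, so no join is needed at search time. The helper name, the foreign-key columns (`genre_id`, `movie_id`), and the sample rows are made up for illustration; the deck does not show how the tables are related.

```python
def build_movie_document(movie, genres_by_id, actors):
    """Flatten one movie row plus its related rows into a single document."""
    cast = [
        f"{actor['first_name']} {actor['last_name']}"
        for actor in actors
        if actor["movie_id"] == movie["id"]
    ]
    return {
        "name": movie["name"],
        "description": movie["description"],
        "genre": genres_by_id[movie["genre_id"]]["name"],
        "actors": cast,
    }


# Made-up rows standing in for the Movies, Genres, and Actors tables.
movie = {"id": 1, "name": "Example Movie", "description": "A story about hacking", "genre_id": 7}
genres_by_id = {7: {"name": "Science Fiction"}}
actors = [
    {"movie_id": 1, "first_name": "Ada", "last_name": "Example"},
    {"movie_id": 2, "first_name": "Bob", "last_name": "Sample"},
]

doc = build_movie_document(movie, genres_by_id, actors)
print(doc)
```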
  19. Tire : Search Basics
      Stay in Sync
      • When records are created, updated, or deleted in the database, do the same to the document in the index
      • Real time
  20. elasticsearch : Search Basics
      • All elasticsearch queries boil down to the task of restricting the result set, scoring, then sorting
      • Searches are handled by the Search API. This API has several other APIs nested inside of it:
        – Query DSL
        – Filter API
        – Facet API
        – Sort API
  22. elasticsearch : Search Basics
      Custom type mapping
      • Search by
        – Title
        – Actor name
        – Description
      • Facet on genre ("index": "not_analyzed")
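A custom type mapping along these lines might look like the following, written as a Python dict that serializes to the JSON the server expects. The type name `movie` and the field names are assumptions based on the schema slide; the `snowball` analyzer and `"index": "not_analyzed"` settings are the ones the deck mentions, using the string-field syntax of the Elasticsearch versions current when the deck was written.

```python
import json

# Hypothetical mapping for a "movie" type; field names follow the schema slide.
movie_mapping = {
    "movie": {
        "properties": {
            # Analyzed with a stemming analyzer so related word forms match.
            "name": {"type": "string", "analyzer": "snowball"},
            "description": {"type": "string", "analyzer": "snowball"},
            "actors": {"type": "string"},
            # Stored as a single exact term so facet counts stay per genre.
            "genre": {"type": "string", "index": "not_analyzed"},
        }
    }
}

print(json.dumps(movie_mapping, indent=2))
```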
  23. elasticsearch : String Query
      (Slide annotations: Index; Search endpoint; HTTP verb; Query field description for “rollerblading”)
  25. elasticsearch : String Query
      ...but what? How did elasticsearch understand that 'rollerblading' and 'rollerblades' were related words?
      • 'snowball analyzer' in the type mapping
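The idea behind a stemming analyzer can be shown with a deliberately crude suffix stripper. This is a toy, not the Snowball algorithm: the suffix list and the function name `toy_stem` are invented for the example. It only demonstrates why two word forms can end up as the same indexed term.

```python
# Toy suffixes, checked in order; a real stemmer has far more careful rules.
SUFFIXES = ("ing", "es", "s")


def toy_stem(word):
    """Strip one common suffix so related word forms share a stem."""
    word = word.lower()
    for suffix in SUFFIXES:
        # Keep at least three characters so short words are left alone.
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word


print(toy_stem("rollerblading"), toy_stem("rollerblades"))
```

If both the indexed text and the query are reduced to stems this way, a search for one form matches documents containing the other.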
  26. elasticsearch : Scoring
      Scores the documents based on similarity to the query
      • The similarity’s value is usually known as the document’s score
  28. elasticsearch : Query Options
      Other parameters and options, such as:
      • Maximum result set size
      • Result offset location
      • Search phrases
      • 'Fuzzy' querying ("skateboarig" matches "skateboarding")
      • 36 different query types, and 25 different filter types
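The first few options correspond to fields in the search request body: `size` for the maximum result set size, `from` for the offset, and query types such as `match_phrase` and `fuzzy`. The sketch below builds such bodies as plain dicts; the helper name `paged_search`, the paging arithmetic, and the phrase text are assumptions for illustration.

```python
import json


def paged_search(query, page, per_page):
    """Wrap a query with an offset ("from") and a size limit."""
    return {"from": (page - 1) * per_page, "size": per_page, "query": query}


# A fuzzy query tolerates small misspellings in the search term.
fuzzy_query = {"fuzzy": {"description": "skateboarig"}}
# A phrase query requires the words to appear together, in order.
phrase_query = {"match_phrase": {"description": "roller skating"}}

body = paged_search(fuzzy_query, page=3, per_page=10)
print(json.dumps(body))
print(json.dumps(paged_search(phrase_query, page=1, per_page=5)))
```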
  29. elasticsearch : Facets
      • Aggregate statistics for query results
      • Example:
        – Consider a user searching for movies by title
        – Provide aggregate counts of distinct genres within the result set
  30. elasticsearch : Facets
      • Disable analysis with "index": "not_analyzed". Otherwise:
        – "Romance" would be transformed to "romanc"
        – "Science Fiction" would be aggregated as two separate categories, "science" and "fiction"
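The effect of analysis on facet counts can be mimicked with a counter. This is a simulation, not a call to Elasticsearch: the genre list is made up, and the "analysis" here is only lowercasing and whitespace splitting, so it reproduces the "science" / "fiction" split but not the stemming of "Romance".

```python
from collections import Counter

# Made-up genre values taken from a hypothetical result set.
genres = ["Romance", "Science Fiction", "Science Fiction", "Romance", "Comedy"]


def facet_not_analyzed(values):
    """Count each value as one exact term."""
    return Counter(values)


def facet_analyzed(values):
    """Count tokens after a naive analysis step (lowercase, split on spaces)."""
    return Counter(token for value in values for token in value.lower().split())


print(facet_not_analyzed(genres))
print(facet_analyzed(genres))
```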
  31. elasticsearch : Facets
      • Searches for movies with a description that contains the word “rollerblading”
  32. Goals : Movie Web Application
      • Want to be able to search movies by
        – Movie description
        – Movie name
        – Actor name
      • Sort results in a chosen direction
      • Sort results by an attribute
      • Facet on genre
        – Aggregate counts of distinct genres within the result set
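Taken together, these goals could be expressed as a single search body combining a query, a sort, and a facet. The sketch below is one possible shape, not the deck's actual implementation: the function name `movie_search` and the field names are assumptions, and it presumes the sort field is indexed in a way that sorts sensibly.

```python
import json


def movie_search(text, sort_by, direction):
    """Build a search body: multi-field query, sort, and a genre facet."""
    if direction not in ("asc", "desc"):
        raise ValueError("direction must be 'asc' or 'desc'")
    return {
        # Search description, name, and actor names in one query.
        "query": {
            "multi_match": {
                "query": text,
                "fields": ["description", "name", "actors"],
            }
        },
        # Sort by one attribute in the requested direction.
        "sort": [{sort_by: {"order": direction}}],
        # Count distinct genres within the result set.
        "facets": {"genre": {"terms": {"field": "genre"}}},
    }


search_body = movie_search("hacking", sort_by="name", direction="asc")
print(json.dumps(search_body, indent=2))
```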
  33. • curl -X POST -d '{"query": {"match": {"_all": "story"}}}' 'http://localhost:9200/movie_db/_search?pretty=true'
      • curl -X POST -d '{"query": {"match": {"description": "hacking"}}, "facets": {"genre": {"terms": {"field": "genre"}}}}' 'http://localhost:9200/movie_db/_search?pretty=true'
  34. Elasticsearch Searching Steps
      • The first step is matching all documents that meet the given criteria
      • The second step is scoring the documents based on similarity to the query
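The match, score, and sort sequence described in the deck can be imitated in a few lines. This is a toy model: the score is a plain count of how often the term appears, which is not how Elasticsearch computes similarity, and the documents and the function name `toy_search` are invented for the example.

```python
def toy_search(documents, term):
    """Match documents containing the term, score them, then sort by score."""
    term = term.lower()
    scored = []
    for doc in documents:
        tokens = doc["description"].lower().split()
        # Step 1: restrict the result set to documents that contain the term.
        if term not in tokens:
            continue
        # Step 2: score each match (toy score: number of occurrences).
        scored.append((tokens.count(term), doc))
    # Step 3: sort by score, highest first.
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [doc["name"] for _, doc in scored]


# Made-up documents for illustration.
docs = [
    {"name": "C", "description": "one story"},
    {"name": "B", "description": "a hacking movie"},
    {"name": "A", "description": "story about a story"},
]
results = toy_search(docs, "story")
print(results)
```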