Elasticsearch - Text to Fulltext in 1 hour

A walkthrough of the major features of Elasticsearch.

Mark van Straten

August 09, 2014

Transcript

  1. How does full text searching work?

    The document’s text is put in an inverted index.

        doc1 = “no limit, no boundaries”
        doc2 = “no limit made music”

        term        documents
        no          1, 1, 2
        limit       1, 2
        boundaries  1
        made        2
        music       2

        $q=boundaries => doc1
        $q=limit      => doc1, doc2
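
The inverted index on this slide can be sketched in a few lines. This is a minimal illustration in Python, not how Elasticsearch or Lucene implement it: the function names `build_inverted_index` and `search` are hypothetical, the splitting is deliberately naive, and only document ids are stored (the slide also records that "no" occurs twice in doc1).

```python
from collections import defaultdict


def build_inverted_index(docs):
    """Map each term to the sorted ids of the documents that contain it."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        # Naive analysis: drop commas, lowercase, split on whitespace.
        for term in text.replace(",", " ").lower().split():
            index[term].add(doc_id)
    return {term: sorted(ids) for term, ids in index.items()}


docs = {1: "no limit, no boundaries", 2: "no limit made music"}
index = build_inverted_index(docs)


def search(term):
    """Look up a single term; unknown terms match nothing."""
    return index.get(term.lower(), [])
```

Looking up `search("boundaries")` touches only one dictionary entry instead of scanning every document, which is the point of the structure.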
  2. Tokenizers

    Tokenizers are used to break a string down into a stream of terms or tokens. A simple tokenizer might split the string up into terms wherever it encounters whitespace or punctuation.

        “The|quick|brown|fox…”
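
A simple whitespace-and-punctuation tokenizer of the kind described above might look like the following sketch. The `tokenize` helper is hypothetical and uses a plain regular expression; the tokenizers shipped with Elasticsearch follow more elaborate rules.

```python
import re


def tokenize(text):
    """Split text into tokens at any run of whitespace or punctuation."""
    # \w+ keeps runs of letters, digits and underscores; everything else separates tokens.
    return re.findall(r"\w+", text)


tokens = tokenize("The quick, brown fox!")
```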
  3. TokenFilters

    Removing words based on given criteria. Commonly used: StopTokenFilter (culture aware, configurable).

        “We are changing the world with the full text searching capabilities of elasticsearch.”
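
As a rough sketch of what a stop-word filter does to the sentence above: the `STOP_WORDS` set below is an assumption chosen for this example, not the list Elasticsearch ships, and `remove_stop_words` is a hypothetical helper.

```python
# Assumed stop-word list for illustration only; real lists are language specific.
STOP_WORDS = frozenset({"we", "are", "the", "with", "of"})


def remove_stop_words(tokens, stop_words=STOP_WORDS):
    """Drop every token whose lowercase form is in the stop-word set."""
    return [t for t in tokens if t.lower() not in stop_words]


sentence = ("We are changing the world with the full text "
            "searching capabilities of elasticsearch")
kept = remove_stop_words(sentence.split())
```

Passing a different `stop_words` set is the toy equivalent of the "culture aware, configurable" remark on the slide.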
  4. Create a new (C#) project

    File -> New -> Console application

    Install the C# client library NEST from NuGet (in the Package Manager Console):

        Install-Package -Id Nest
  5. Index mappings

    By default, every field is analyzed. Some fields (such as [url]) might not need that.

        private static PutMappingDescriptor<MyObject> mapStaatberichtForIndex(PutMappingDescriptor<MyObject> obj)
        {
            return obj
                .Properties(p => p
                    .String(n => n.Name(nn => nn.Name).Index(FieldIndexOption.NotAnalyzed))
                    .String(n => n.Name(nn => nn.Url).Index(FieldIndexOption.NotAnalyzed))
                );
        }
  6. Searching

    Searching returns results with a score; a lower score means a less relevant result.

        var queryStringQuery = client.Search<MyObject>(s => s.QueryString("Lightning fast "));
        Console.WriteLine("Found " + queryStringQuery.Total + " results, first 10:");
        foreach (var doc in queryStringQuery.Documents)
        {
            Console.WriteLine(" - Found: " + doc.Url);
        }
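
To make "results with a score" concrete, here is a toy ranking sketch. It scores a document by counting how often the query terms occur in it, which is only a stand-in: the actual relevance formula used by Elasticsearch is more involved, and `score` and `ranked_search` are hypothetical names.

```python
def score(query, text):
    """Toy relevance: total number of occurrences of the query terms in the text."""
    terms = text.lower().replace(",", " ").split()
    return sum(terms.count(q) for q in query.lower().split())


def ranked_search(query, docs):
    """Return ids of matching documents, highest score first (ties by id)."""
    scored = [(score(query, text), doc_id) for doc_id, text in docs.items()]
    hits = [(s, d) for s, d in scored if s > 0]
    return [d for s, d in sorted(hits, key=lambda h: (-h[0], h[1]))]


docs = {1: "no limit, no boundaries", 2: "no limit made music"}
ranking = ranked_search("no", docs)
```

With this data, doc1 ranks above doc2 for the query "no" because the term occurs twice in doc1 and once in doc2.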
  7. Filtering

    Filtering is binary (yes/no) and gives no scoring. (FAST! It uses just the inverted index.)

        var filteredItems = client.Search<MyObject>(s => s.Filter(f => f.Term(t => t.Name, "IronMan")));
        foreach (var doc in filteredItems.Documents)
        {
            Console.WriteLine(" - Found: " + doc.Url);
        }
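
The yes/no nature of a term filter can be sketched as a plain exact-match test. This assumes the field is not analyzed, so values are compared verbatim; the sample records and their URLs are made up for the example, and `term_filter` is a hypothetical helper.

```python
# Hypothetical sample records; only the name "IronMan" comes from the slide.
records = [
    {"name": "IronMan", "url": "http://example.com/ironman"},
    {"name": "Flash", "url": "http://example.com/flash"},
]


def term_filter(items, field, value):
    """Keep items whose field equals the value exactly; no score is computed."""
    return [item for item in items if item.get(field) == value]


matches = term_filter(records, "name", "IronMan")
```

Note that `term_filter(records, "name", "ironman")` returns nothing in this sketch: a document is either in or out, with no partial relevance.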
  8. Aggregations (previously: Facets)

        var facetsSearch = client.Search<MyObject>(s => s
            .FacetTerm(t => t.OnField(tf => tf.Category))
        );
        foreach (var facet in facetsSearch.Facets)
        {
            Console.WriteLine("Facet: " + facet.Key);
            foreach (var facetValue in (facet.Value as TermFacet).Items)
                Console.WriteLine(" - Option: " + facetValue.Term);
        }
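
Conceptually, a terms facet (or terms aggregation) counts how many documents carry each distinct value of a field. A minimal sketch, using made-up records and a hypothetical `terms_aggregation` helper:

```python
from collections import Counter


def terms_aggregation(items, field):
    """Count documents per distinct field value, most frequent first."""
    return Counter(item[field] for item in items if field in item).most_common()


# Hypothetical documents; the category values are invented for the example.
items = [
    {"name": "IronMan", "category": "hero"},
    {"name": "Flash", "category": "hero"},
    {"name": "Zoom", "category": "villain"},
]
buckets = terms_aggregation(items, "category")
```

Each `(value, count)` pair corresponds to one "Option" line printed by the C# loop on the slide.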
  9. Suggest

        var suggestQuery = client.Search<MyObject>(s => s
            .SuggestTerm("suggestterm", sg => sg.OnField(sgf => sgf.TextBody).Text("fastest man alive"))
        );
        foreach (var suggest in suggestQuery.Suggest["suggestterm"])
        {
            Console.WriteLine(" - Maybe you mean: " + suggest.Text);
        }
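
The idea behind a term suggester is to propose indexed terms that lie close to a term the user typed. The sketch below uses Python's `difflib` similarity ratio as a stand-in for that closeness measure, so it is an approximation of the idea and not the Elasticsearch algorithm; the vocabulary, the misspelled input, and the `suggest` helper are assumptions for the example.

```python
import difflib


def suggest(text, vocabulary, cutoff=0.7):
    """For each word missing from the vocabulary, propose its closest known term."""
    suggestions = {}
    for word in text.lower().split():
        if word in vocabulary:
            continue  # already an indexed term, nothing to suggest
        close = difflib.get_close_matches(word, vocabulary, n=1, cutoff=cutoff)
        if close:
            suggestions[word] = close[0]
    return suggestions


# Assumed index vocabulary and a deliberately misspelled query.
vocabulary = ["fastest", "man", "alive"]
result = suggest("fastst man alive", vocabulary)
```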