You're not using Elasticsearch?

An introduction to Elasticsearch 1.0 and how Tolq uses it for its translation memory

Timon Vonk

March 13, 2014

Transcript

  1. About me • CTO @ Tolq, zero effort website translations • Freelance hacker and consultant
  2. What is Elasticsearch? • Search engine • Built with the cloud in mind • JSON API • Scriptable • NoSQL • Great for things other than search
  3. Why use it? • Cloud setup out of the box • Fast indexing • Easy API • On-the-fly mappings • Very customisable
  4. OMG WHAT IS LUCENE • Java library for search • Only handles the search bit • Term-based vector algorithm
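As a rough illustration of what a term-based vector approach means, the sketch below turns texts into term-count vectors and ranks documents by cosine similarity to a query. It is a toy under stated assumptions: the documents and function names are made up, and Lucene's actual scoring adds term weighting and normalisation that are not modelled here.

```python
import math
from collections import Counter


def term_vector(text):
    """Map a text to a sparse term -> count vector (bag of words)."""
    return Counter(text.lower().split())


def cosine(a, b):
    """Cosine similarity between two sparse term vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


# Hypothetical documents, for illustration only.
docs = {
    "d1": "salted peanuts and roasted peanuts",
    "d2": "search engines rank documents",
}

query = term_vector("roasted peanuts")
# Rank document ids by similarity to the query, best match first.
ranked = sorted(docs, key=lambda d: cosine(query, term_vector(docs[d])), reverse=True)
```

Documents that share no terms with the query get a similarity of zero, which is why the analysis step (deciding what counts as a term) matters so much.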
  5. Making a query • Just send JSON:

    GET localhost:9200/example/peanuts/_search
    { "query": { "text": { "my_field": "many search terms" } } }

    {
      "took": 5,
      "timed_out": false,
      "_shards": { "total": 5, "successful": 5, "failed": 0 },
      "hits": [
        {
          "_index": "example",
          "_type": "peanuts",
          "_score": 0.9,
          "_source": { …data }
        }
      ]
    }
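The request on this slide can be put together with nothing but the Python standard library. The sketch below only builds the request and never sends it, so it runs without a cluster. The helper names are hypothetical. The query type is a parameter because its name has varied between Elasticsearch releases. The response helper assumes the simplified response shape from the slide, and a real response may nest the hit list differently.

```python
import json
import urllib.request


def build_search_request(base_url, index, doc_type, field, terms, query_type="text"):
    """Build (but do not send) a search request for one field."""
    body = {"query": {query_type: {field: terms}}}
    url = f"{base_url}/{index}/{doc_type}/_search"
    # POST is used because some HTTP clients drop the body of a GET request.
    return urllib.request.Request(
        url,
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def top_sources(response):
    """Return the _source of each hit, assuming the slide's response shape."""
    return [hit["_source"] for hit in response["hits"]]


request = build_search_request(
    "http://localhost:9200", "example", "peanuts", "my_field", "many search terms"
)
```

Passing `request` to `urllib.request.urlopen` would send it to a node listening on localhost:9200.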
  6. Other types of queries • Terms, full text, boolean, fuzzy, geolocation and lots more variants • Filters, aggregations, percolation, suggestions
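Because every query is just nested JSON, the variants compose naturally. As a minimal sketch, the helpers below assemble a boolean query from `must`, `should` and `must_not` clauses. The helper names and the `status` field are hypothetical, chosen only for illustration.

```python
import json


def term(field, value):
    """An exact-term clause on a single field."""
    return {"term": {field: value}}


def bool_query(must=(), should=(), must_not=()):
    """Combine clauses into a bool query, leaving out empty sections."""
    clauses = {"must": list(must), "should": list(should), "must_not": list(must_not)}
    return {"bool": {name: items for name, items in clauses.items() if items}}


# Full-text match on one field, excluding documents flagged as drafts.
body = {
    "query": bool_query(
        must=[{"match": {"my_field": "search terms"}}],
        must_not=[term("status", "draft")],
    )
}
payload = json.dumps(body)
```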
  7. Analysing • Pre-index and pre-search • This is when scoring happens • Remove stop words, stemming, other normalisations • You can create your own analysers
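To make the analysis steps concrete, here is a toy analyser that lowercases, tokenises, drops stop words and strips a few suffixes. It is a sketch only: the stop-word list and the suffix rules are made up for illustration and are far cruder than the analysers that ship with Elasticsearch.

```python
import re

# A tiny, hypothetical stop-word list.
STOP_WORDS = {"a", "an", "and", "the", "of", "to", "is"}


def naive_stem(token):
    """Strip one common English suffix; a stand-in for a real stemmer."""
    for suffix in ("ing", "ed", "s"):
        if token.endswith(suffix) and len(token) > len(suffix) + 2:
            return token[: -len(suffix)]
    return token


def analyse(text):
    """Lowercase, tokenise, remove stop words, then stem what is left."""
    tokens = re.findall(r"[a-z0-9]+", text.lower())
    return [naive_stem(t) for t in tokens if t not in STOP_WORDS]
```

Running the same function over the indexed text and over the query is what lets "searching" in a document match "search" in a query.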
  8. Aggregations

    { "query": … }
    { "aggregations": { "rubyist_stats": { "stats": { "field": "meetup_visits" } } } }

    • Aggregations are an upgrade over the facets of pre-1.0 releases
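The aggregation body from this slide can be generated rather than typed by hand. The sketch below builds the same `stats` aggregation and optionally attaches a query to restrict which documents are aggregated. The helper name is hypothetical, and the `match_all` query is only a placeholder for whatever the slide elides.

```python
import json


def stats_aggregation(name, field, query=None):
    """Build a request body with one named stats aggregation."""
    body = {"aggregations": {name: {"stats": {"field": field}}}}
    if query is not None:
        # The aggregation is computed over the documents this query matches.
        body["query"] = query
    return body


body = stats_aggregation("rubyist_stats", "meetup_visits", query={"match_all": {}})
payload = json.dumps(body)
```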
  9. Different kinds • min, max, avg, sum • stats, extended_stats (all of the above + stdev/mean) • percentile • counts • … all scriptable!
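To show what these metrics compute, the sketch below reproduces them locally over a plain list of numbers. It assumes a non-empty list and a population (not sample) variance. The sample values are made up, and the key names are illustrative rather than a guaranteed match for the fields in an Elasticsearch response.

```python
import math


def stats(values):
    """count, min, max, avg and sum over a non-empty list of numbers."""
    total = sum(values)
    return {
        "count": len(values),
        "min": min(values),
        "max": max(values),
        "avg": total / len(values),
        "sum": total,
    }


def extended_stats(values):
    """Everything in stats() plus population variance and standard deviation."""
    result = stats(values)
    variance = sum((v - result["avg"]) ** 2 for v in values) / result["count"]
    result["variance"] = variance
    result["std_deviation"] = math.sqrt(variance)
    return result


# Hypothetical meetup visit counts.
visits = [2, 4, 4, 4, 5, 5, 7, 9]
summary = extended_stats(visits)
```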
  10. (Ruby) libraries • Good old Tire (deprecated) • Stretcher! • elasticsearch-ruby • Also libraries for Go, Node, JavaScript, etc. • … it's just JSON
  11. How Tolq uses Elasticsearch • Suggest better translations for translators • Fast access to text and translations for general search