Elasticsearch - Why big system need you

Elasticsearch is an awesome, flexible, and powerful open source, distributed, real-time search and analytics engine for the cloud.

Trần Kim Hiếu

October 26, 2013

Transcript

  1. Elasticsearch: Why big system need you

    by Trần Kim Hiếu, a Mobile & Web Developer @ Silicon Straits Saigon
  2. Searching is hard and important

    • Functional requirements: find the right things (effectiveness)
    • Non-functional requirements: find the things right (efficiency)

    Speed is useless without relevance
    Biggest problem: search is highly subjective
  3. What is elasticsearch?

    Distributed RESTful search and analytics

    • real time data
    • real time analytics
    • distributed
    • document oriented
    • conflict management
    • schema free
    • restful api
    • per-operation
  4. Installing

    After downloading the latest release and extracting it, elasticsearch can be started using:

    $ bin/elasticsearch

    • To run in the foreground: $ bin/elasticsearch -f
    • The $JAVA_HOME variable must be set
  5. Using

    Insert:

    $ curl -X PUT http://localhost:9200/products/product/1 -d '{ "name" : "high quality search engine" }'

    Searching:

    $ curl -X POST 'http://localhost:9200/products/product/_search' -d '{ "query" : { "match" : { "name" : "search" } } }'
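The two curl calls on this slide can also be expressed in code. The following is a minimal sketch using only the Python standard library; it builds the same PUT and POST requests without sending them. The helper names and `BASE_URL` are illustrative assumptions, not part of any Elasticsearch client.

```python
import json
import urllib.request

# Assumption: a local node listening on the default HTTP port, as on the slide.
BASE_URL = "http://localhost:9200"


def build_index_request(index, doc_type, doc_id, document):
    """Build (but do not send) a PUT request that stores one JSON document."""
    url = f"{BASE_URL}/{index}/{doc_type}/{doc_id}"
    body = json.dumps(document).encode("utf-8")
    return urllib.request.Request(url, data=body, method="PUT")


def build_search_request(index, doc_type, field, text):
    """Build (but do not send) a POST request that runs a match query."""
    url = f"{BASE_URL}/{index}/{doc_type}/_search"
    body = json.dumps({"query": {"match": {field: text}}}).encode("utf-8")
    return urllib.request.Request(url, data=body, method="POST")


# Same index, type, id, and document as the curl examples.
index_req = build_index_request(
    "products", "product", 1, {"name": "high quality search engine"}
)
search_req = build_search_request("products", "product", "name", "search")
```

Passing either request to `urllib.request.urlopen` would send it, which requires a running node. Newer Elasticsearch releases also expect a `Content-Type: application/json` header, which this sketch leaves out to mirror the slide.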
  6. Logging

    • Uses the well-known Java logging library Log4j
    • Separate logging configuration (simplified log4j): config/logging.yml
    • Keywords: Log4j - http://logging.apache.org/log4j
  7. Sharding & Replication

    Sharding is index partitioning
    • Splits logical data into physically smaller parts
    • Controls data flows

    Replication shares the same data across several machines
    • Increases throughput through concurrency
    • Tolerates node outages without data loss
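To make the two ideas concrete, here is a small sketch of hash-based shard routing and replica placement. It is an illustration only: CRC32 stands in for whatever hash Elasticsearch uses internally, and the round-robin placement, function names, and node names are assumptions made for this example.

```python
import zlib


def route_to_shard(doc_id, num_primary_shards):
    """Pick a shard by hashing the document id (CRC32 is a stand-in hash)."""
    return zlib.crc32(str(doc_id).encode("utf-8")) % num_primary_shards


def place_copies(num_primary_shards, num_replicas, nodes):
    """Spread each shard's primary and replicas over distinct nodes, round-robin."""
    if num_replicas >= len(nodes):
        raise ValueError("need more nodes than replicas to keep copies apart")
    layout = {}
    for shard in range(num_primary_shards):
        # Copy 0 is the primary; copies 1..num_replicas are its replicas.
        layout[shard] = [
            nodes[(shard + copy) % len(nodes)] for copy in range(num_replicas + 1)
        ]
    return layout


nodes = ["node-a", "node-b", "node-c"]  # hypothetical node names
layout = place_copies(num_primary_shards=5, num_replicas=1, nodes=nodes)
shard = route_to_shard("1", 5)
```

Because every copy of a shard lives on a different node, losing one node still leaves at least one copy of each shard, and reads can be spread across the copies.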
  8. Mapping

    Each JSON field can be mapped to a specific core type. JSON itself already provides some typing, with its support for:

    • string
    • integer/long
    • float/double
    • boolean
    • null
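The typing that JSON already carries can be sketched as a small function that maps a parsed JSON value to one of the type names listed above. This only illustrates the idea; it is not Elasticsearch's actual mapping logic, and `core_type_of` and the sample document are hypothetical.

```python
import json


def core_type_of(value):
    """Map a parsed JSON value to one of the type names from the slide."""
    if value is None:
        return "null"
    # bool must be checked before int, because bool is a subclass of int.
    if isinstance(value, bool):
        return "boolean"
    if isinstance(value, int):
        return "long"
    if isinstance(value, float):
        return "double"
    if isinstance(value, str):
        return "string"
    raise TypeError(f"not a simple JSON value: {value!r}")


doc = json.loads(
    '{"name": "high quality search engine", "price": 12.5,'
    ' "stock": 3, "active": true, "note": null}'
)
inferred = {field: core_type_of(value) for field, value in doc.items()}
```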
  9. Mapping

    Other types that elasticsearch supports:

    • core types
    • array type
    • object type
    • root object type
    • nested type
    • multi field type
    • ip type
    • geo point type
    • geo shape type
    • attachment type

    http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/mapping-types.html
  10. Searching

    Different ways of searching:

    • Search queries: match, term, prefix, id, fuzzy
    • Counting only, Geo-based queries
    • More like this, Highlighting
    • Faceting, Percolation, Scripting
    • Suggestions
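As a rough sketch, the search queries named above map to JSON request bodies like the ones built below. The helper names are illustrative, and reading the slide's "id" as the `ids` query is an assumption.

```python
import json


def match_query(field, text):
    """Full-text match: the text is analyzed before matching."""
    return {"match": {field: text}}


def term_query(field, value):
    """Exact term lookup: the value is not analyzed."""
    return {"term": {field: value}}


def prefix_query(field, prefix):
    """Matches terms that start with the given prefix."""
    return {"prefix": {field: prefix}}


def fuzzy_query(field, value):
    """Matches terms similar to the given value."""
    return {"fuzzy": {field: value}}


def ids_query(doc_ids):
    """Matches documents by id (assumed to be the slide's "id" query)."""
    return {"ids": {"values": list(doc_ids)}}


def search_body(query):
    """Wrap a query in the body expected by the _search endpoint."""
    return json.dumps({"query": query})


body = search_body(match_query("name", "search"))
```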
  11. Faceting

    Faceting allows aggregation of search results.

    • Term: group results by a term
    • Range: group by price or date ranges
    • Histogram: group results in equally sized buckets, also available as a date histogram
    • Statistical: include statistical data such as min, max, sum, avg, and more
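What each facet type computes can be illustrated over a small in-memory list of documents. This sketch mirrors the semantics described above and is not how Elasticsearch implements facets; the function names, output keys, and sample data are assumptions.

```python
from collections import Counter


def terms_facet(docs, field):
    """Count documents per distinct value of a field."""
    return dict(Counter(doc[field] for doc in docs))


def range_facet(docs, field, ranges):
    """Count documents per (low, high) range: low inclusive, high exclusive.

    None on either side leaves that side of the range open.
    """
    counts = []
    for low, high in ranges:
        counts.append(sum(
            1 for doc in docs
            if (low is None or doc[field] >= low)
            and (high is None or doc[field] < high)
        ))
    return counts


def histogram_facet(docs, field, interval):
    """Count documents per equally sized bucket, keyed by the bucket's lower bound."""
    return dict(Counter((doc[field] // interval) * interval for doc in docs))


def statistical_facet(docs, field):
    """Summary statistics over a numeric field."""
    values = [doc[field] for doc in docs]
    return {
        "count": len(values),
        "min": min(values),
        "max": max(values),
        "total": sum(values),
        "mean": sum(values) / len(values),
    }


docs = [
    {"brand": "acme", "price": 5},
    {"brand": "zenith", "price": 12},
    {"brand": "acme", "price": 27},
]
by_brand = terms_facet(docs, "brand")
by_range = range_facet(docs, "price", [(None, 10), (10, None)])
by_bucket = histogram_facet(docs, "price", 10)
stats = statistical_facet(docs, "price")
```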
  12. Pluggable architecture

    • Modularized architecture
    • Plugins are simple zip files with a predefined layout
    • Different plugin use-cases:
      • Lucene features
      • Monitoring
      • Scripting languages
      • Rivers
  13. Clients & integrations

    • Tons of languages supported already:
      Perl, Python, Ruby, PHP, JavaScript, .NET, Scala, Clojure, Erlang
    • Lots of integrations available:
      Grails, Play Framework (1, 2), Spring & spring-data
      Django, Haystack, Catalyst, Node, Mongoose
      WordPress, Drupal, Symfony2, CakePHP, Nagios, Munin, collectd, MCollective, Chef
  14. Resources

    • Elasticsearch official site - elasticsearch.org
    • Introduction: Getting down and dirty with elasticsearch (Clinton Gormley) - http://www.slideshare.net/clintongormley/down-and-dirty-with-elasticsearch
    • Explore your data with elasticsearch by Elasticsearch Inc - https://speakerdeck.com/elasticsearch/explore-your-data-with-elasticsearch
    • Extending Elasticsearch by Alexander Reelsen - https://speakerdeck.com/spinscale/extending-elasticsearch