All About Elasticsearch Language Clients [Elasticon 2015]

Honza Král, Karel Minařík, Zachary Tong: All About Elasticsearch Language Clients

A talk about the motivation, process and tools used when creating clients for Elasticsearch's REST API in various programming languages.

Karel Minarik

March 11, 2015

Transcript

  1. The Motivation

    Clients are part of the experience
    Fragmentation and inconsistency
    Solving the same problems over and over again
    Lack of support for the full breadth of Elasticsearch's APIs
    Different assumptions about exposing the APIs
  2. Exhibit A: The Tire client for Ruby

    Incomplete support for query and filter types
    Mixing together a Ruby API and Rails integration
    Still widely used and liked
  3. Design Principles

    “Nobody should have a reason to not use the client”
    We have no opinions
    Respect the language
  4. The REST API Specification

    Describes the HTTP methods, the URL parts, allowed parameters, documentation links for every API
    Now part of the core Elasticsearch repository
    A communication tool
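To make the idea of a machine-readable specification concrete, here is a minimal sketch of how a client might consume one entry. It is not taken from any official client. `INDEX_SPEC` is a trimmed copy of the `index` entry (methods and paths only), and `build_path` and `choose_method` are hypothetical helpers. The rule "PUT when an id is given, POST otherwise" is an assumption made for illustration.

```python
import re
from urllib.parse import quote

# Trimmed copy of the "index" specification entry: methods and paths only.
INDEX_SPEC = {
    "methods": ["POST", "PUT"],
    "url": {
        "paths": ["/{index}/{type}", "/{index}/{type}/{id}"],
    },
}

PLACEHOLDER = re.compile(r"\{(\w+)\}")


def build_path(spec, **parts):
    """Return the most specific URL path whose placeholders are all supplied."""
    candidates = []
    for template in spec["url"]["paths"]:
        names = PLACEHOLDER.findall(template)
        if all(parts.get(name) is not None for name in names):
            candidates.append((len(names), template))
    if not candidates:
        raise ValueError("not enough URL parts for any known path")
    _, template = max(candidates)
    # Percent-encode each part so that it is safe inside a URL path segment.
    return PLACEHOLDER.sub(
        lambda m: quote(str(parts[m.group(1)]), safe=""), template
    )


def choose_method(spec, **parts):
    """Assumed convention: PUT when an explicit id is given, POST otherwise."""
    method = "PUT" if parts.get("id") is not None else "POST"
    if method not in spec["methods"]:
        raise ValueError("method not allowed by the specification")
    return method
```

Calling `build_path(INDEX_SPEC, index="my_index", type="my_type", id=1)` yields a path with all three parts filled in; leaving out `id` falls back to the shorter template.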
  5.

    {
      "index": {
        "documentation": "http://www.elasticsearch.org/guide/en/elasticsearch/reference/ …
        "methods": ["POST", "PUT"],
        "url": {
          "path": "/{index}/{type}",
          "paths": ["/{index}/{type}", "/{index}/{type}/{id}"],
          "parts": {
            "id": {
              "type" : "string",
              "description" : "Document ID"
            },
            "index": {
              "type" : "string",
              "required" : true,
              "description" : "The name of the index"
            },
            "type": {
              …

    https://github.com/elasticsearch/elasticsearch/blob/master/rest-api-spec/api/index.json
  6. PHP

    $client->index([
        'index' => 'my_index',
        'type'  => 'my_type',
        'id'    => 1,
        'body'  => [
            'title' => 'Hello World!'
        ]
    ]);
  7. Perl

    $client->index(
        index => 'my_index',
        type  => 'my_type',
        id    => 1,
        body  => {
            title => "Hello World!"
        }
    );
  8. JavaScript

    client.index({
      index: 'myindex',
      type: 'mytype',
      id: '1',
      body: {
        title: 'Hello World!'
      }
    })
    .then(function (response) {
      //...
    })
    .catch(function (error) {
      //...
    });
  9. Namespaced APIs

    Logical groups for the API
    client.cluster.health
    client->cluster->health()
    client.indices.create
    client->indices->create()
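The namespacing shown above can be sketched in a few lines. This is an illustration only, not the code of any official client: `RecordingTransport`, `ClusterNamespace`, `IndicesNamespace` and `Client` are hypothetical names, and the transport records requests instead of sending them so the example runs without a cluster.

```python
class RecordingTransport:
    """Stand-in transport that records requests instead of sending them."""

    def __init__(self):
        self.calls = []

    def perform_request(self, method, path, body=None):
        self.calls.append((method, path, body))
        return {"acknowledged": True}


class ClusterNamespace:
    """Groups cluster-level APIs under client.cluster."""

    def __init__(self, transport):
        self.transport = transport

    def health(self):
        return self.transport.perform_request("GET", "/_cluster/health")


class IndicesNamespace:
    """Groups index management APIs under client.indices."""

    def __init__(self, transport):
        self.transport = transport

    def create(self, index, body=None):
        return self.transport.perform_request("PUT", "/" + index, body)


class Client:
    """Top-level client; every namespace shares the same transport."""

    def __init__(self, transport):
        self.transport = transport
        self.cluster = ClusterNamespace(transport)
        self.indices = IndicesNamespace(transport)
```

With this layout, `client.cluster.health()` and `client.indices.create("my_index")` read the same way as the calls on the slide, while the request logic stays in one transport object.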
  10. The Common Integration Test Suite

    Verification of the specification “contract”
    YAML-based notation
    xUnit concepts, tailored to use-case
    “Runners” implemented in client languages
    Core developers contribute the tests
    Continuous integration
  11.

    setup:
      - do:
          index:
            index: test
            type: test
            id: testing_document
            body:
              body: Amsterdam meetup
      - do:
          indices.refresh: {}

    ---
    "Basic tests for suggest API":
      - do:
          suggest:
            body:
              test_suggestion:
                text: "The Amsterdma meetpu"
                term:
                  field: body
      - match: {test_suggestion.1.options.0.text: amsterdam}

    https://github.com/elasticsearch/elasticsearch/blob/master/rest-api-spec/test/suggest/10_basic.yaml
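A runner for this notation has to resolve dotted paths such as `test_suggestion.1.options.0.text` against the response. The following is a rough sketch of that one step, not the logic of any actual runner: `lookup` and `run_match` are hypothetical helpers, `SAMPLE_RESPONSE` is hand-written rather than real Elasticsearch output, and details such as escaped dots in keys are ignored.

```python
# Hand-written stand-in for a suggest response; not real cluster output.
SAMPLE_RESPONSE = {
    "test_suggestion": [
        {"text": "the", "options": []},
        {"text": "amsterdma", "options": [{"text": "amsterdam"}]},
        {"text": "meetpu", "options": [{"text": "meetup"}]},
    ]
}


def lookup(document, dotted_path):
    """Walk a nested dict/list structure following a dotted path."""
    current = document
    for segment in dotted_path.split("."):
        if isinstance(current, list):
            current = current[int(segment)]  # numeric segments index lists
        else:
            current = current[segment]
    return current


def run_match(response, expectation):
    """Check every {path: expected} pair of a 'match' step."""
    for path, expected in expectation.items():
        actual = lookup(response, path)
        if actual != expected:
            raise AssertionError(
                f"{path}: expected {expected!r}, got {actual!r}"
            )


# The 'match' step from the test above, applied to the sample response.
run_match(SAMPLE_RESPONSE, {"test_suggestion.1.options.0.text": "amsterdam"})
```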
  12. The Implementation Sketch

    Ruby and Python
    “Enough differences” to notice language peculiarities
    Working in parallel
    Blueprint for all languages
  13. The Implementation Sketch

    Focus on functionality and naming, not abstractions
    Easy to change and reason about
    Working code: no room for elaborate diagrams, endless speculation or abstract discussions
  14. High Level Architecture

    [Diagram; labels: CLIENT, TRANSPORT, SNIFFER, CONNECTION POOL, SELECTOR, RANDOM, ROUND ROBIN, CONNECTION (three boxes), HTTP LIBRARY 1, HTTP LIBRARY 2, HTTP LIBRARY 3, …]
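One possible reading of how these components fit together is sketched below. The wiring is an assumption, since the diagram's arrows are not part of the transcript: a transport asks a connection pool for a connection, the pool delegates the choice to a selector, and a failed connection is marked dead before the request is retried. All class names are illustrative, the sniffer is left out, and dead connections are never revived here.

```python
import random


class Connection:
    """Wraps one node; a real client would delegate to an HTTP library here."""

    def __init__(self, host, healthy=True):
        self.host = host
        self.healthy = healthy  # simulates whether the node responds

    def perform_request(self, method, path):
        if not self.healthy:
            raise ConnectionError(f"{self.host} did not respond")
        return f"{method} http://{self.host}{path}"


class RoundRobinSelector:
    def __init__(self):
        self.counter = -1

    def select(self, connections):
        self.counter += 1
        return connections[self.counter % len(connections)]


class RandomSelector:
    def select(self, connections):
        return random.choice(connections)


class ConnectionPool:
    def __init__(self, connections, selector):
        self.alive = list(connections)
        self.dead = []
        self.selector = selector

    def get_connection(self):
        if not self.alive:
            raise ConnectionError("no live connections left")
        return self.selector.select(self.alive)

    def mark_dead(self, connection):
        self.alive.remove(connection)
        self.dead.append(connection)


class Transport:
    def __init__(self, pool, max_retries=3):
        self.pool = pool
        self.max_retries = max_retries

    def perform_request(self, method, path):
        for _ in range(self.max_retries):
            connection = self.pool.get_connection()
            try:
                return connection.perform_request(method, path)
            except ConnectionError:
                # Take the failing node out of rotation and try another one.
                self.pool.mark_dead(connection)
        raise ConnectionError("request failed after retries")
```

Swapping `RoundRobinSelector()` for `RandomSelector()` changes the selection strategy without touching the pool or the transport.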
  15. Tracer

    client = Elasticsearch::Client.new trace: true
    client.index index: 'my_index', type: 'my_type', id: '1', body: { title: 'Hello World!' }

    curl -X PUT 'http://localhost:9200/my_index/my_type/1?pretty' -d '{ "title":"Hello World!" }'

    # 2015-03-10T07:55:37-07:00 [201] (0.270s)
    #
    # {
    #   "_index":"my_index",
    #   "_type":"my_type",
    #   "_id":"1",
    #   "_version":1,
    #   "created":true
    #
    # }
  16. Selector

    Customization for different cluster topologies
    The “local rack” selector
    “Sticky sessions” in PHP
  17.

    <?php
    namespace Elasticsearch\ConnectionPool\Selectors;

    use Elasticsearch\Connections\ConnectionInterface;

    class StickyRoundRobinSelector implements SelectorInterface
    {
        private $current = 0;
        private $currentCounter = 0;

        /**
         * Use current connection unless it is dead, otherwise round-robin
         */
        public function select($connections)
        {
            if ($connections[$this->current]->isAlive()) {
                return $connections[$this->current];
            }

            $this->currentCounter += 1;
            $this->current = $this->currentCounter % count($connections);

            return $connections[$this->current];
        }
    }
    ?>

    https://github.com/elasticsearch/elasticsearch-php/blob/master/src/Elasticsearch/ConnectionPool/Selectors/StickyRoundRobinSelector.php
  18. Sniffer

    Make use of Elasticsearch's dynamic nature
    Reuse the cluster state information
    Add and remove nodes dynamically
    Reload nodes list on failure or periodically
  19.

    from elasticsearch import Elasticsearch

    es = Elasticsearch(
        # sniff before doing anything
        sniff_on_start=True,
        # refresh nodes after a node fails to respond
        sniff_on_connection_fail=True,
        # and also every 60 seconds
        sniffer_timeout=60
    )

    https://www.elastic.co/webinars/the-why-and-what-about-python
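What "reloading the nodes list" might involve can be sketched as a small parsing step: take a nodes info response and turn it into a list of hosts. The response below is hand-written, and both the `http_address` field and its `inet[/host:port]` format are assumptions modeled on Elasticsearch releases of that era. `hosts_from_nodes_info` is a hypothetical helper, not a function of the Python client.

```python
import re

# Hand-written stand-in for a nodes info response; not real cluster output.
SAMPLE_NODES_INFO = {
    "cluster_name": "example",
    "nodes": {
        "node-a": {"name": "A", "http_address": "inet[/10.0.0.1:9200]"},
        "node-b": {"name": "B", "http_address": "inet[/10.0.0.2:9200]"},
        "node-c": {"name": "C"},  # e.g. a node without an HTTP address
    },
}

# Matches the trailing "/host:port" with an optional closing bracket.
ADDRESS = re.compile(r"/(?P<host>[\w.\-]+):(?P<port>\d+)\]?$")


def hosts_from_nodes_info(nodes_info):
    """Extract (host, port) pairs for every node that exposes an HTTP address."""
    hosts = []
    for node_id in sorted(nodes_info.get("nodes", {})):
        address = nodes_info["nodes"][node_id].get("http_address")
        match = ADDRESS.search(address) if address else None
        if match:
            hosts.append((match.group("host"), int(match.group("port"))))
    return hosts
```

A sniffer would then replace the connection pool's contents with connections built from this list, either on a timer or after a failure, as the options above suggest.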
  20. randomize_hosts

    By default, the client round-robins across the node list
    Prevent the “sequential load” effect in multi-process/threaded environments
    Why not [host1, host2].shuffle?
    Educate users about this fact and increase usability
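The "sequential load" effect can be illustrated with a small simulation. This is a sketch under stated assumptions, not the client's implementation: each simulated worker builds its own round-robin over the same host list, and `first_hosts` (a hypothetical helper) counts which host each worker would contact first. The per-worker seed exists only to keep the demo repeatable.

```python
import itertools
import random
from collections import Counter

HOSTS = ["host1", "host2", "host3"]


def first_hosts(worker_count, randomize):
    """Count which host each worker would contact with its first request."""
    first = []
    for worker_id in range(worker_count):
        hosts = list(HOSTS)
        if randomize:
            # Each worker shuffles its own copy of the host list.
            random.Random(worker_id).shuffle(hosts)
        round_robin = itertools.cycle(hosts)
        first.append(next(round_robin))
    return Counter(first)


# Without shuffling, every worker starts at the head of the list.
print(first_hosts(50, randomize=False))
# With shuffling, first requests are spread over the hosts.
print(first_hosts(50, randomize=True))
```

Without randomization all fifty workers send their first request to `host1`; shuffling per worker spreads that initial load, which is the behavior the option is meant to provide without every user having to remember to shuffle the list themselves.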
  21. This work is licensed under the Creative Commons Attribution-NoDerivatives 4.0 International License.

    To view a copy of this license, visit: http://creativecommons.org/licenses/by-nd/4.0/ or send a letter to: Creative Commons, PO Box 1866, Mountain View, CA 94042, USA