
All About Elasticsearch Language Clients [Elasticon 2015]


Honza Král, Karel Minařík, Zachary Tong: All About Elasticsearch Language Clients

A talk about the motivation, process and tools used when creating clients for the Elasticsearch REST API in various programming languages.

Karel Minarik

March 11, 2015


Transcript

  1. All About
    Elasticsearch
    Language Clients
    Honza Král
    Karel Minařík
    Zachary Tong


  2. The motivation, the design process and the architecture
    behind the official clients for Elasticsearch.


  3. The Motivation
    Clients are part of the experience
    Fragmentation and inconsistency
    Solving the same problems over and over again
    Lack of support for the full breadth of Elasticsearch's APIs
    Different assumptions about exposing the APIs


  4. Exhibit A: The Tire client for Ruby
    Incomplete support for query and filter types
    Mixing together a Ruby API and Rails integration
    Still widely used and liked


  5. Design Principles
    “Nobody should have a reason to not use the client”
    We have no opinions
    Respect the language


  6. The Groundwork
    The specification of the Elasticsearch REST API
    The common integration test suite


  7. The REST API Specification
    Describes the HTTP methods, the URL parts, allowed parameters and documentation links for every API
    Now part of the core Elasticsearch repository
    A communication tool


  8. {
      "index": {
        "documentation": "http://www.elasticsearch.org/guide/en/elasticsearch/reference/
        "methods": ["POST", "PUT"],
        "url": {
          "path": "/{index}/{type}",
          "paths": ["/{index}/{type}", "/{index}/{type}/{id}"],
          "parts": {
            "id": {
              "type" : "string",
              "description" : "Document ID"
            },
            "index": {
              "type" : "string",
              "required" : true,
              "description" : "The name of the index"
            },
            "type": {
    https://github.com/elasticsearch/elasticsearch/blob/master/rest-api-spec/api/index.json
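    A specification entry like the one on this slide carries enough information to derive a request path mechanically. The following is a minimal Python sketch of that idea, not code from any official client: `SPEC` is a hand-trimmed copy of the visible part of the `index` entry, and `build_path` is a hypothetical helper that picks the most specific path whose placeholders were all supplied.

```python
import re

# Hand-trimmed fragment of the "index" entry shown on the slide (assumption:
# only "index" is marked required, because the slide cuts off before "type").
SPEC = {
    "methods": ["POST", "PUT"],
    "paths": ["/{index}/{type}", "/{index}/{type}/{id}"],
    "required": ["index"],
}


def build_path(spec, **parts):
    """Return the most specific path whose placeholders are all supplied."""
    for name in spec["required"]:
        if name not in parts:
            raise ValueError("missing required URL part: " + name)
    candidates = []
    for path in spec["paths"]:
        names = re.findall(r"\{(\w+)\}", path)
        if all(name in parts for name in names):
            candidates.append((len(names), path))
    if not candidates:
        raise ValueError("no path matches the supplied parts")
    # The candidate with the most placeholders is the most specific one.
    _, best = max(candidates)
    return best.format(**parts)


print(build_path(SPEC, index="my_index", type="my_type", id="1"))
```

    A real client would also validate query parameters and choose the HTTP method from the same entry; this sketch covers only the URL parts.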


  9. Ruby
    client.index index: 'my_index',
                 type: 'my_type',
                 id: '1',
                 body: {
                   title: 'Hello World!'
                 }


  10. Python
    client.index(
        index='my_index',
        doc_type='my_type',
        id=1,
        body={'title': 'Hello World!'}
    )


  11. PHP
    $client->index([
        'index' => 'my_index',
        'type' => 'my_type',
        'id' => 1,
        'body' => [
            'title' => 'Hello World!'
        ]
    ]);


  12. Perl
    $client->index(
        index => 'my_index',
        type  => 'my_type',
        id    => 1,
        body  => { title => "Hello World!" }
    );


  13. JavaScript
    client.index({
      index: 'myindex',
      type: 'mytype',
      id: '1',
      body: {
        title: 'Hello World!'
      }
    })
    .then(function (response) {
      //...
    })
    .catch(function (error) {
      //...
    });


  14. C#
    client.Index("my_index", "my_type", "1",
        new {
            title = "Hello World!"
        }
    );


  15. Namespaced APIs
    Logical groups for the API
    client.cluster.health
    client->cluster->health()
    client.indices.create
    client->indices->create()
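    One way to get the `client.cluster.health` and `client.indices.create` shape is to hang small namespace objects off the client, all sharing one transport. This is a rough Python sketch under that assumption; every class name here is hypothetical, and `FakeTransport` only records calls instead of talking to a cluster.

```python
class FakeTransport:
    """Records requests instead of sending them (stand-in for real HTTP)."""

    def __init__(self):
        self.calls = []

    def perform_request(self, method, path):
        self.calls.append((method, path))
        return {"acknowledged": True}


class ClusterNamespace:
    """Groups cluster-level APIs."""

    def __init__(self, transport):
        self.transport = transport

    def health(self):
        return self.transport.perform_request("GET", "/_cluster/health")


class IndicesNamespace:
    """Groups index-management APIs."""

    def __init__(self, transport):
        self.transport = transport

    def create(self, index):
        return self.transport.perform_request("PUT", "/" + index)


class Client:
    """Top-level client; namespaces share the same transport."""

    def __init__(self, transport):
        self.transport = transport
        self.cluster = ClusterNamespace(transport)
        self.indices = IndicesNamespace(transport)


transport = FakeTransport()
client = Client(transport)
client.cluster.health()
client.indices.create("my_index")
print(transport.calls)
```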


  16. The Common Integration Test Suite
    Verification of the specification “contract”
    YAML-based notation
    xUnit concepts, tailored to use-case
    “Runners” implemented in client languages
    Core developers contribute the tests
    Continuous integration


  17. setup:
      - do:
          index:
            index: test
            type: test
            id: testing_document
            body:
              body: Amsterdam meetup
      - do:
          indices.refresh: {}
    ---
    "Basic tests for suggest API":
      - do:
          suggest:
            body:
              test_suggestion:
                text: "The Amsterdma meetpu"
                term:
                  field: body
      - match: {test_suggestion.1.options.0.text: amsterdam}
    https://github.com/elasticsearch/elasticsearch/blob/master/rest-api-spec/test/suggest/10_basic.yaml
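    The `match` step in the test above addresses a value inside the response with a dotted path. A runner needs some way to resolve such a path; the Python sketch below shows one plausible approach, assuming numeric segments index into lists. The `resolve` and `run_match` helpers are hypothetical, and the `response` dict is hand-written illustrative data, not captured Elasticsearch output.

```python
def resolve(document, dotted_path):
    """Walk a dotted path; numeric segments index into lists."""
    current = document
    for segment in dotted_path.split("."):
        if isinstance(current, list):
            current = current[int(segment)]
        else:
            current = current[segment]
    return current


def run_match(response, expectation):
    """Check every `path: expected` pair of a `match` step."""
    for path, expected in expectation.items():
        actual = resolve(response, path)
        assert actual == expected, "%s: %r != %r" % (path, actual, expected)


# Made-up response, shaped so that the path from the slide resolves.
response = {
    "test_suggestion": [
        {"text": "the", "options": []},
        {"text": "amsterdma", "options": [{"text": "amsterdam"}]},
        {"text": "meetpu", "options": [{"text": "meetup"}]},
    ]
}

run_match(response, {"test_suggestion.1.options.0.text": "amsterdam"})
```

    A full runner would also parse the YAML, execute the `setup` and `do` steps through the client under test, and support the other assertion types; those parts are left out here.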


  18. The Implementation Sketch
    Ruby and Python
    “Enough differences” to notice language peculiarities
    Working in parallel
    Blueprint for all languages


  19. The Implementation Sketch
    Focus on functionality and naming, not abstractions
    Easy to change and reason about
    Working code — no room for elaborate diagrams,
    endless speculation or abstract discussions


  20. High Level Architecture
    [Diagram] CLIENT, TRANSPORT, SNIFFER, CONNECTION POOL with a SELECTOR (RANDOM or ROUND ROBIN), and three CONNECTION boxes alongside HTTP LIBRARY 1, 2 and 3
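    The layering named on this slide can be sketched in a few lines of Python. This is an illustrative arrangement only, assuming the transport asks the pool for a connection and the pool delegates the choice to a selector; the class and method names are invented for the example, and `Connection` returns a string where a real one would call an HTTP library.

```python
import random


class Connection:
    """One node; a real client would wrap an HTTP library here."""

    def __init__(self, host):
        self.host = host

    def perform_request(self, method, path):
        return "%s http://%s%s" % (method, self.host, path)


class RoundRobinSelector:
    """Cycles through the connections in order."""

    def __init__(self):
        self.counter = -1

    def select(self, connections):
        self.counter = (self.counter + 1) % len(connections)
        return connections[self.counter]


class RandomSelector:
    """Picks any connection with equal probability."""

    def select(self, connections):
        return random.choice(connections)


class ConnectionPool:
    """Holds the connections and delegates the choice to a selector."""

    def __init__(self, connections, selector):
        self.connections = connections
        self.selector = selector

    def get_connection(self):
        return self.selector.select(self.connections)


class Transport:
    """Sits between the client API and the pool."""

    def __init__(self, pool):
        self.pool = pool

    def perform_request(self, method, path):
        return self.pool.get_connection().perform_request(method, path)


pool = ConnectionPool([Connection("a:9200"), Connection("b:9200")], RoundRobinSelector())
transport = Transport(pool)
print([transport.perform_request("GET", "/") for _ in range(3)])
```

    Swapping `RoundRobinSelector()` for `RandomSelector()` changes the node choice without touching the transport or the connections, which is the point of keeping the selector separate.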


  21. Tracer
    client = Elasticsearch::Client.new trace: true
    client.index index: 'my_index',
                 type: 'my_type',
                 id: '1',
                 body: {
                   title: 'Hello World!'
                 }
    curl -X PUT 'http://localhost:9200/my_index/my_type/1?pretty' -d '{
      "title":"Hello World!"
    }'
    # 2015-03-10T07:55:37-07:00 [201] (0.270s)
    #
    # {
    #   "_index":"my_index",
    #   "_type":"my_type",
    #   "_id":"1",
    #   "_version":1,
    #   "created":true
    #
    # }
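    The trace output above is a request rendered as a copy-and-paste curl command. A tracer could produce such a line roughly as follows; this Python sketch uses a hypothetical `trace_as_curl` helper and is not taken from any client. It assumes the URL has no query string yet and that the body contains no single quotes, since it does no shell escaping.

```python
import json


def trace_as_curl(method, url, body=None):
    """Format a request as a curl command line, as a tracer might log it."""
    # Assumption: `url` carries no query string, so "?pretty" can be appended.
    command = "curl -X %s '%s?pretty'" % (method, url)
    if body is not None:
        # No shell escaping is done; a body containing ' would break the line.
        command += " -d '%s'" % json.dumps(body)
    return command


print(trace_as_curl(
    "PUT",
    "http://localhost:9200/my_index/my_type/1",
    {"title": "Hello World!"},
))
```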


  22. Selector
    Customization for different cluster topologies
    The “local rack” selector
    “Sticky sessions” in PHP


  23. <?php
    namespace Elasticsearch\ConnectionPool\Selectors;

    use Elasticsearch\Connections\ConnectionInterface;

    class StickyRoundRobinSelector implements SelectorInterface
    {
        private $current = 0;
        private $currentCounter = 0;

        /**
         * Use current connection unless it is dead, otherwise round-robin
         */
        public function select($connections)
        {
            if ($connections[$this->current]->isAlive()) {
                return $connections[$this->current];
            }
            $this->currentCounter += 1;
            $this->current = $this->currentCounter % count($connections);
            return $connections[$this->current];
        }
    }
    ?>
    https://github.com/elasticsearch/elasticsearch-php/blob/master/src/Elasticsearch/ConnectionPool/Selectors/StickyRoundRobinSelector.php


  24. Sniffer
    Make use of Elasticsearch's dynamic nature
    Reuse the cluster state information
    Add and remove nodes dynamically
    Reload nodes list on failure or periodically
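    The reload rules listed on this slide (on start, on failure, periodically) can be expressed compactly. The Python sketch below is a simplified model, not the logic of any official client: `Sniffer` is a hypothetical class, `fetch_nodes` stands in for whatever call returns the current node list, and the clock is injectable so the periodic reload can be exercised without waiting.

```python
import time


class Sniffer:
    """Keeps a host list fresh; `fetch_nodes` stands in for a cluster-state call."""

    def __init__(self, fetch_nodes, timeout=60, clock=time.monotonic):
        self.fetch_nodes = fetch_nodes
        self.timeout = timeout
        self.clock = clock
        self.hosts = []
        self.last_sniff = None
        self.sniff()  # load the node list before doing anything else

    def sniff(self):
        """Replace the host list with whatever the cluster reports now."""
        self.hosts = list(self.fetch_nodes())
        self.last_sniff = self.clock()

    def get_hosts(self):
        """Return the hosts, reloading first if the timeout has elapsed."""
        if self.clock() - self.last_sniff >= self.timeout:
            self.sniff()
        return self.hosts

    def on_connection_fail(self):
        """Reload immediately after a node fails to respond."""
        self.sniff()


# Simulated cluster and clock, so the example runs instantly.
now = [0.0]
cluster = [["node1", "node2"]]
sniffer = Sniffer(lambda: cluster[0], timeout=60, clock=lambda: now[0])

cluster[0] = ["node1", "node2", "node3"]  # a node joins
print(sniffer.get_hosts())                # still the old list
now[0] = 61.0                             # the timeout passes
print(sniffer.get_hosts())                # the new node is picked up
```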


  25. from elasticsearch import Elasticsearch

    es = Elasticsearch(
        # sniff before doing anything
        sniff_on_start=True,
        # refresh nodes after a node fails to respond
        sniff_on_connection_fail=True,
        # and also every 60 seconds
        sniffer_timeout=60
    )
    https://www.elastic.co/webinars/the-why-and-what-about-python


  26. randomize_hosts
    By default, the client round-robins across the node list
    Prevent the “sequential load” effect in multi-process/threaded environments
    Why not [host1, host2].shuffle?
    Educate users about this fact and increase usability
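    The “sequential load” effect is easy to see in a small simulation. In the Python sketch below, each simulated process builds its own host list and sends its first request to the head of that list, as a fresh round-robin selector would. The `first_hosts` function and the host names are made up for the illustration; the seed only makes the run repeatable.

```python
import random
from collections import Counter

HOSTS = ["host1", "host2", "host3"]


def first_hosts(process_count, randomize, seed=0):
    """Count which host receives the first request of each simulated process."""
    rng = random.Random(seed)
    picks = []
    for _ in range(process_count):
        hosts = list(HOSTS)
        if randomize:
            rng.shuffle(hosts)
        picks.append(hosts[0])  # round-robin starts at the head of the list
    return Counter(picks)


print(first_hosts(300, randomize=False))  # every process starts on the same host
print(first_hosts(300, randomize=True))   # first requests spread across hosts
```

    Without shuffling, all processes open with the same node; shuffling per process spreads that initial load, which is what a built-in option gives users who would not think to shuffle the list themselves.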


  27. An Elasticsearch client
    is much more than
    a “just HTTP and JSON”
    wrapper


  28. “Nobody
    should have a reason
    to not use the client”


  29. Thank you! Questions?


  30. This work is licensed under the
    Creative Commons Attribution-NoDerivatives 4.0 International License.
    To view a copy of this license, visit:
    http://creativecommons.org/licenses/by-nd/4.0/
    or send a letter to:
    Creative Commons
    PO Box 1866
    Mountain View, CA 94042
    USA
