
All About Elasticsearch's Language Clients (Honza Kral, Karel Minarik and Zachary Tong)

Elastic Co
March 10, 2015

In this talk, we will walk you through the process of designing and implementing the official clients for Elasticsearch. We will explain the design process, cover the tools for making sure the clients have a consistent API, the testing infrastructure, explain the features of the clients in depth, and address the differences between the implementation in various programming languages. After this talk, attendees should have a solid understanding of features of the Elasticsearch language clients and the design decisions behind them.

Transcript

  1. All About Elasticsearch Language Clients
    Honza Král, Karel Minařík, Zachary Tong
  2. The motivation, the design process and the architecture behind the official clients for Elasticsearch.
  3. The Motivation
    Clients are part of the experience
    Fragmentation and inconsistency
    Solving the same problems over and over again
    Lack of support for the full breadth of Elasticsearch's APIs
    Different assumptions about exposing the APIs
  4. Exhibit A: The Tire client for Ruby
    Incomplete support for query and filter types
    Mixing together a Ruby API and Rails integration
    Still widely used and liked
  5. Design Principles
    “Nobody should have a reason to not use the client”
    We have no opinions
    Respect the language
  6. The Groundwork
    The specification of the Elasticsearch REST API
    The common integration test suite
  7. The REST API Specification
    Describes the HTTP methods, the URL parts, allowed parameters, documentation links for every API
    Now part of the core Elasticsearch repository
    A communication tool
  8. (excerpt, truncated on the slide)
        {
          "index": {
            "documentation": "http://www.elasticsearch.org/guide/en/elasticsearch/reference/…",
            "methods": ["POST", "PUT"],
            "url": {
              "path": "/{index}/{type}",
              "paths": ["/{index}/{type}", "/{index}/{type}/{id}"],
              "parts": {
                "id": {
                  "type" : "string",
                  "description" : "Document ID"
                },
                "index": {
                  "type" : "string",
                  "required" : true,
                  "description" : "The name of the index"
                },
                "type": {
                  …
    https://github.com/elasticsearch/elasticsearch/blob/master/rest-api-spec/api/index.json
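    As an illustration of how a client might consume this specification, here is a minimal Python sketch that picks a URL path from the "paths" list based on which parts the caller supplies. It is not taken from any official client: INDEX_SPEC is a hand-reduced copy of the excerpt above (the attributes of the "type" part are cut off on the slide, so they are left minimal here), and build_path is a hypothetical helper name. URL escaping is deliberately omitted.

```python
import re

# Reduced, hand-copied fragment of the "index" API spec shown on the slide.
# Assumption: only the fields this sketch needs are kept.
INDEX_SPEC = {
    "methods": ["POST", "PUT"],
    "url": {
        "paths": ["/{index}/{type}", "/{index}/{type}/{id}"],
        "parts": {
            "id": {"type": "string"},
            "index": {"type": "string", "required": True},
            "type": {"type": "string"},
        },
    },
}


def build_path(spec, **parts):
    """Pick the most specific path whose placeholders are all supplied."""
    url = spec["url"]
    # Reject the call early if a part marked as required is missing.
    for name, meta in url["parts"].items():
        if meta.get("required") and parts.get(name) is None:
            raise ValueError("missing required URL part: " + name)
    supplied = {k: v for k, v in parts.items() if v is not None}
    # Try longer (more specific) path templates first.
    for path in sorted(url["paths"], key=len, reverse=True):
        needed = re.findall(r"\{(\w+)\}", path)
        if all(n in supplied for n in needed):
            return path.format(**{n: supplied[n] for n in needed})
    raise ValueError("no path matches the supplied parts")


print(build_path(INDEX_SPEC, index="my_index", type="my_type", id=1))
print(build_path(INDEX_SPEC, index="my_index", type="my_type"))
```

    Driving URL construction from the specification rather than hand-writing it per API is one way the clients could stay consistent with each other; the real clients may generate code from the spec instead of interpreting it at runtime.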
  9. Ruby
        client.index index: 'my_index',
                     type: 'my_type',
                     id: '1',
                     body: { title: 'Hello World!' }
  10. Python
        client.index(
            index='my_index',
            doc_type='my_type',
            id=1,
            body={'title': 'Hello World!'}
        )
  11. PHP
        $client->index([
            'index' => 'my_index',
            'type'  => 'my_type',
            'id'    => 1,
            'body'  => [ 'title' => 'Hello World!' ]
        ]);
  12. Perl
        $client->index(
            index => 'my_index',
            type  => 'my_type',
            id    => 1,
            body  => { title => "Hello World!" }
        );
  13. JavaScript
        client.index({
          index: 'myindex',
          type: 'mytype',
          id: '1',
          body: { title: 'Hello World!' }
        })
        .then(function (response) {
          //...
        })
        .catch(function (error) {
          //...
        });
  14. C#
        client.Index("my_index", "my_type", "1", new { title = "Hello World!" } );
  15. Namespaced APIs
    Logical groups for the API
        client.cluster.health     client->cluster->health()
        client.indices.create     client->indices->create()
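    The grouping above can be sketched in Python as small namespace objects that share one transport. The class names and the recording_transport stand-in are hypothetical, chosen only for illustration; the two endpoints used (GET /_cluster/health and PUT /{index}) are the standard Elasticsearch REST endpoints for these calls.

```python
class Namespace:
    """Base class: a namespace only holds a reference to the shared transport."""

    def __init__(self, transport):
        self.transport = transport


class ClusterNamespace(Namespace):
    def health(self):
        return self.transport("GET", "/_cluster/health")


class IndicesNamespace(Namespace):
    def create(self, index):
        return self.transport("PUT", "/" + index)


class Client:
    """Top-level client exposing the namespaces as attributes."""

    def __init__(self, transport):
        self.transport = transport
        self.cluster = ClusterNamespace(transport)
        self.indices = IndicesNamespace(transport)


# Stand-in transport (an assumption for this sketch): it records requests
# instead of sending them over HTTP.
sent = []


def recording_transport(method, path):
    sent.append((method, path))
    return {"acknowledged": True}


client = Client(recording_transport)
client.cluster.health()
client.indices.create("my_index")
print(sent)
```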
  16. The Common Integration Test Suite
    Verification of the specification “contract”
    YAML-based notation
    xUnit concepts, tailored to use-case
    “Runners” implemented in client languages
    Core developers contribute the tests
    Continuous integration
  17. 
        setup:
          - do:
              index:
                index: test
                type: test
                id: testing_document
                body:
                  body: Amsterdam meetup
          - do:
              indices.refresh: {}

        ---
        "Basic tests for suggest API":
          - do:
              suggest:
                body:
                  test_suggestion:
                    text: "The Amsterdma meetpu"
                    term:
                      field: body
          - match: {test_suggestion.1.options.0.text: amsterdam}
    https://github.com/elasticsearch/elasticsearch/blob/master/rest-api-spec/test/suggest/10_basic.yaml
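    To make the idea of a "runner" concrete, here is a minimal Python sketch that executes "do" and "match" steps like the ones in the test above. It is an assumption-laden toy, not the actual runner of any client: the steps are given as already-parsed dictionaries (no YAML library is used), only these two step types are handled, and fake_api is a hypothetical stand-in that returns a canned response instead of calling Elasticsearch.

```python
def lookup(data, dotted_path):
    """Walk a dotted path; segments index into lists when the value is a list."""
    for key in dotted_path.split("."):
        data = data[int(key)] if isinstance(data, list) else data[key]
    return data


def run_test(steps, call_api):
    """Execute 'do' steps and check 'match' steps against the last response."""
    response = None
    for step in steps:
        if "do" in step:
            # Each 'do' step names exactly one API and its parameters.
            (api, params), = step["do"].items()
            response = call_api(api, params)
        elif "match" in step:
            for path, expected in step["match"].items():
                actual = lookup(response, path)
                assert actual == expected, f"{path}: {actual!r} != {expected!r}"
        else:
            raise ValueError(f"unsupported step: {step!r}")


# Hypothetical stand-in for the client: returns a canned suggest response.
def fake_api(api, params):
    return {"test_suggestion": [
        {"text": "the", "options": []},
        {"text": "amsterdma", "options": [{"text": "amsterdam"}]},
    ]}


steps = [
    {"do": {"suggest": {"body": {"test_suggestion": {
        "text": "The Amsterdma meetpu", "term": {"field": "body"}}}}}},
    {"match": {"test_suggestion.1.options.0.text": "amsterdam"}},
]
run_test(steps, fake_api)
```

    A real runner would map the API name in each "do" step onto the corresponding client method, which is what ties the test suite back to the REST API specification.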
  18. The Implementation Sketch
    Ruby and Python
    “Enough differences” to notice language peculiarities
    Working in parallel
    Blueprint for all languages
  19. The Implementation Sketch
    Focus on functionality and naming, not abstractions
    Easy to change and reason about
    Working code: no room for elaborate diagrams, endless speculation or abstract discussions
  20. High Level Architecture
    [Diagram. Labels: CLIENT, TRANSPORT, SNIFFER, CONNECTION POOL, SELECTOR, RANDOM, ROUND ROBIN, CONNECTION (three boxes), HTTP LIBRARY 1, HTTP LIBRARY 2, HTTP LIBRARY 3, …]
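    The layout of the diagram is lost in this transcript, so the following Python sketch is only one plausible reading of how the named components could fit together: a transport that asks a connection pool for a connection, with a pluggable selector choosing among live connections. All class and method names are illustrative assumptions, and the Connection class only formats the request where a real client would delegate to an HTTP library.

```python
import itertools
import random


class Connection:
    """Wraps one node; a real client would delegate to an HTTP library here."""

    def __init__(self, host):
        self.host = host
        self.alive = True

    def perform_request(self, method, path):
        return f"{method} http://{self.host}{path}"


class RoundRobinSelector:
    def __init__(self):
        self._counter = itertools.count()

    def select(self, connections):
        return connections[next(self._counter) % len(connections)]


class RandomSelector:
    def select(self, connections):
        return random.choice(connections)


class ConnectionPool:
    """Tracks connections and lets the selector choose among the live ones."""

    def __init__(self, connections, selector):
        self.connections = connections
        self.selector = selector

    def get_connection(self):
        alive = [c for c in self.connections if c.alive]
        if not alive:
            raise RuntimeError("no live connections")
        return self.selector.select(alive)


class Transport:
    def __init__(self, hosts, selector=None):
        self.pool = ConnectionPool([Connection(h) for h in hosts],
                                   selector or RoundRobinSelector())

    def perform_request(self, method, path):
        return self.pool.get_connection().perform_request(method, path)


transport = Transport(["a:9200", "b:9200"])
print(transport.perform_request("GET", "/"))
print(transport.perform_request("GET", "/"))
```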
  21. Tracer
        client = Elasticsearch::Client.new trace: true
        client.index index: 'my_index', type: 'my_type', id: '1', body: { title: 'Hello World!' }

        curl -X PUT 'http://localhost:9200/my_index/my_type/1?pretty' -d '{ "title":"Hello World!" }'
        # 2015-03-10T07:55:37-07:00 [201] (0.270s)
        #
        # {
        #   "_index":"my_index",
        #   "_type":"my_type",
        #   "_id":"1",
        #   "_version":1,
        #   "created":true
        #
        # }
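    The tracer output above is a request rendered as a copy-pasteable curl command. A minimal Python sketch of that rendering step, assuming a hypothetical helper named as_curl (this is not the Ruby client's implementation), could look like this:

```python
import json
import shlex


def as_curl(method, url, body=None):
    """Render a request as a shell-safe curl command line."""
    parts = ["curl", "-X", method, shlex.quote(url)]
    if body is not None:
        # Serialize the body as JSON and quote it for the shell.
        parts += ["-d", shlex.quote(json.dumps(body))]
    return " ".join(parts)


print(as_curl("PUT",
              "http://localhost:9200/my_index/my_type/1?pretty",
              {"title": "Hello World!"}))
```

    Emitting the trace in curl form means a logged request can be replayed against a cluster without the client, which is useful when reporting or reproducing problems.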
  22. Selector
    Customization for different cluster topologies
    The “local rack” selector
    “Sticky sessions” in PHP
  23. 
        <?php

        namespace Elasticsearch\ConnectionPool\Selectors;

        use Elasticsearch\Connections\ConnectionInterface;

        class StickyRoundRobinSelector implements SelectorInterface
        {
            private $current = 0;
            private $currentCounter = 0;

            /**
             * Use current connection unless it is dead, otherwise round-robin
             */
            public function select($connections)
            {
                if ($connections[$this->current]->isAlive()) {
                    return $connections[$this->current];
                }

                $this->currentCounter += 1;
                $this->current = $this->currentCounter % count($connections);

                return $connections[$this->current];
            }
        }
        ?>
    https://github.com/elasticsearch/elasticsearch-php/blob/master/src/Elasticsearch/ConnectionPool/Selectors/StickyRoundRobinSelector.php
  24. Sniffer
    Make use of Elasticsearch's dynamic nature
    Reuse the cluster state information
    Add and remove nodes dynamically
    Reload the node list on failure or periodically
  25. 
        from elasticsearch import Elasticsearch

        es = Elasticsearch(
            # sniff before doing anything
            sniff_on_start=True,
            # refresh nodes after a node fails to respond
            sniff_on_connection_fail=True,
            # and also every 60 seconds
            sniffer_timeout=60
        )
    https://www.elastic.co/webinars/the-why-and-what-about-python
  26. randomize_hosts
    By default, the client round-robins across the node list
    Prevents the “sequential load” effect in multi-process/threaded environments
    Why not [host1, host2].shuffle?
    Educate users about this fact and increase usability
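    The “sequential load” effect can be shown with a small simulation: if every process starts its round-robin at the head of the same host list, the first request from each process lands on the same node. The Python sketch below is illustrative only; first_hosts is a hypothetical helper, and a seeded random generator stands in for the independent shuffles that separate processes would perform.

```python
import random
from collections import Counter


def first_hosts(hosts, processes, randomize, seed=0):
    """Count which host each freshly started process contacts first."""
    rng = random.Random(seed)
    firsts = []
    for _ in range(processes):
        order = list(hosts)
        if randomize:
            rng.shuffle(order)
        # Round-robin starts at the head of the (possibly shuffled) list.
        firsts.append(order[0])
    return Counter(firsts)


hosts = ["host1", "host2", "host3"]
print(first_hosts(hosts, 100, randomize=False))
print(first_hosts(hosts, 100, randomize=True))
```

    Without randomization every simulated process hits host1 first; with it, the first requests spread across the nodes. Offering this as a named client option, rather than leaving users to shuffle the list themselves, is the usability point the slide makes.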
  27. An Elasticsearch client is much more than a “just HTTP and JSON” wrapper
  28. “Nobody should have a reason to not use the client”

  29. Thank you! Questions!
  30. This work is licensed under the Creative Commons Attribution-NoDerivatives 4.0 International License. To view a copy of this license, visit: http://creativecommons.org/licenses/by-nd/4.0/ or send a letter to: Creative Commons, PO Box 1866, Mountain View, CA 94042, USA