Your first steps with Elasticsearch

David Padilla

July 07, 2015

Transcript

  1. David Padilla Elasticsearch

  2. @dabit

  3. Elasticsearch

  4. None
  5. SELECT * FROM properties;

  6. None
  7. SELECT * FROM properties WHERE bedrooms = 2;

  8. None
  9. SELECT * FROM properties WHERE bedrooms = 2 AND bathrooms = 2;
  10. None
  11. SELECT * FROM properties INNER JOIN tasks ON tasks.property_id = properties.id WHERE bedrooms = 2 AND bathrooms = 3 AND tasks.pending = true;
  12. None
  13. SELECT * FROM properties INNER JOIN tasks ON tasks.property_id = properties.id WHERE bedrooms = 2 AND bathrooms = 3 AND tasks.pending = true AND name LIKE ('%query%');
  14. SELECT * FROM properties INNER JOIN tasks ON tasks.property_id = properties.id LEFT JOIN addresses ON addresses.id = properties.id WHERE bedrooms = 2 AND bathrooms = 3 AND tasks.pending = true AND (name LIKE ('%query%') OR addresses.street LIKE ('%query%') OR addresses.state LIKE ('%query%'));
  15. Elasticsearch

  16. None
  17. Java ew

  18. Shay Banon

  19. REST API Lucene

  20. REST API Lucene Client Client Client

  21. REST

  22. GET POST PUT DELETE

  23. Scalable

  24. REST API Lucene REST API Lucene REST API Lucene REST API Lucene
  25. None
  26. High availability

  27. None
  28. Indices, Properties, Documents

  29. SQL analogy: Indices, Properties, Documents, Types (Elasticsearch); Tables, Columns, Rows, Database (SQL)
  30. Schemaless

  31. Create an index: POST localhost:9200/sistemix

  32. Create a document: POST localhost:9200/sistemix/contacto { "nombre": "Daniel Pliego", "edad": 25, "intereses": ["diseño"] }
  33. POST sistemix/contacto { "nombre": "Omaro Cancellara", "edad": 30, "intereses": ["angular", "rails"] }
    POST sistemix/contacto { "nombre": "Alvaro Pereyra", "edad": 25, "intereses": ["node", "rails"] }
    POST sistemix/contacto { "nombre": "Gloria Palma", "edad": 20, "intereses": ["angular", "cloud"] }
  34. Searches

  35. match: GET sistemix/contacto/_search { "query": { "match": { "nombre": "omaro" } } }
  36. {"hits": [{"_index": "sistemix", "_type": "contacto", "_id": "AUxX2Y-xjmAhoUOInopz", "_score": 0.8784157, "_source": { "nombre": "Omaro Cancellara", "edad": 30, "intereses": ["angular", "rails"] } }]}
  37. term: GET sistemix/contacto/_search { "query": { "term": { "edad": 25 } } }
  38. "hits": [ { "nombre": "Daniel Pliego", "edad": 25, "intereses": ["diseño"] }, { "nombre": "Alvaro Pereyra", "edad": 25, "intereses": ["node", "rails"] } ]
  39. term on arrays: GET sistemix/contacto/_search { "query": { "term": { "intereses": "rails" } } }
  40. "hits": [ { "nombre": "Omaro Cancellara", "edad": 30, "intereses": ["angular", "rails"] }, { "nombre": "Alvaro Pereyra", "edad": 25, "intereses": ["node", "rails"] } ]
  41. SELECT * FROM contacts INNER JOIN interests ON contacts.id = interests.id WHERE interests.name = 'rails';
  42. Inverted index. Document columns: 1 2 3 4 5 6 7. Terms, each marked with an x: daniel, pliego, omaro, cancellara, gloria, palma, alvaro, pereyra
  43. Tokenization: Char Filter (clean unwanted characters), Tokenize (split into tokens), Token Filter (clean tokens)
  44. Char filter: "El hierro es el amigo más honesto que puedes tener" -> "El hierro es el amigo mas honesto que puedes tener"
  45. Tokenize: "El hierro es el amigo mas honesto que puedes tener" -> el, hierro, es, el, amigo, mas, honesto, que, puedes, tener
  46. Token Filter: "El hierro es el amigo mas honesto que puedes tener" -> hierro, amigo, honesto, puedes, tener
  47. Inverted index. Document columns: 1 2. Terms, each marked with an x: hierro, honesto, amigo, puedes, tener
  48. Partial searches

  49. Poor man's full text search: WHERE nombre LIKE 'oma%'; WHERE nombre LIKE 'dan%'; WHERE nombre LIKE '%plie%'
  50. N-Gram Tokenizer

  51. Daniel: dan, dani, daniel, ani, anie, aniel, nie, niel

  52. GET sistemix/contacto/_search { "query": { "match": { "nombre": "dan" } } }
  53. Inverted index. Document columns: 1 2. Terms, each marked with an x: dan, dani, daniel, ani, anie, aniel, nie, niel
  54. Aggregations

  55. None
  56. aggregations: GET sistemix/contacto/_search { "query": { "term": { "intereses": "rails" } } }
  57. aggregations: GET sistemix/contacto/_search { "query": { "term": { "intereses": "rails" } }, "aggregations": { "edades": { "terms": { "field": "edad" } } } }
  58. { "nombre": "Omaro Cancellara", "edad": 30, "intereses": ["angular", "rails"] } }, { "nombre": "Alvaro Pereyra", "edad": 25, "intereses": ["node", "rails"] } } }]}, "aggregations": { "edades": { "_type": "terms", "missing": 0, "total": 2, "other": 0, "terms": [ {"term": 25, "count": 1}, {"term": 30, "count": 1} ] } } }
  59. aggregations: GET sistemix/contacto/_search { "query": { "term": { "intereses": "rails" } }, "aggregations": { "edades_promedio": { "avg": { "field": "edad" } } } }
  60. { "nombre": "Omaro Cancellara", "edad": 30, "intereses": ["angular", "rails"] } }, { "nombre": "Alvaro Pereyra", "edad": 25, "intereses": ["node", "rails"] } } }]}, "aggregations": { "edades_promedio": { "value": 27.5 } } }
  61. term, average, max, min, sum, range, date range

  62. Spatial search

  63. { "pin" : { "location" : { "lat" : 40.12, "lon" : -71.34 } } }
  64. geo distance: { "filtered" : { "query" : { "match_all" : {} }, "filter" : { "geo_distance" : { "distance" : "200km", "pin.location" : { "lat" : 40, "lon" : -70 } } } } }
  65. 200 km

  66. geo bounding box: { "filtered" : { "query" : { "match_all" : {} }, "filter" : { "geo_bounding_box" : { "pin.location" : { "top_left" : { "lat" : 40.73, "lon" : -74.1 }, "bottom_right" : { "lat" : 40.01, "lon" : -71.12 } } } } } }
  67. None
  68. geo polygon: { "filtered" : { "query" : { "match_all" : {} }, "filter" : { "geo_polygon" : { "person.location" : { "points" : [ {"lat" : 40, "lon" : -70}, {"lat" : 30, "lon" : -80}, {"lat" : 20, "lon" : -90} ] } } } } }
  69. None
  70. Conclusion

  71. Easy to use

  72. None
  73. Easy to configure

  74. ##################### Elasticsearch Configuration Example #####################
    # This file contains an overview of various configuration settings,
    # targeted at operations staff. Application developers should
    # consult the guide at <http://elasticsearch.org/guide>.
    #
    # The installation procedure is covered at
    # <http://elasticsearch.org/guide/en/elasticsearch/reference/current/setup.html>.
    #
    # Elasticsearch comes with reasonable defaults for most settings,
    # so you can try it out without bothering with configuration.
    #
    # Most of the time, these defaults are just fine for running a production
    # cluster. If you're fine-tuning your cluster, or wondering about the
    # effect of certain configuration option, please _do ask_ on the
    # mailing list or IRC channel [http://elasticsearch.org/community].
    # Any element in the configuration can be replaced with environment variables
    # by placing them in ${...} notation. For example:
    #
    #node.rack: ${RACK_ENV_VAR}
    # For information on supported formats and syntax for the config file, see
    # <http://elasticsearch.org/guide/en/elasticsearch/reference/current/setup-configuration.html>
    ################################### Cluster ###################################
    # Cluster name identifies your cluster for auto-discovery. If you're running
    # multiple clusters on the same network, make sure you're using unique names.
    #
    cluster.name: julian
    #################################### Node #####################################
    # Node names are generated dynamically on startup, so you're relieved
    # from configuring them manually. You can tie this node to a specific name:
    #
    node.name: "Luna"
  75. Modern

  76. The End

  77. @dabit david @ easybroker.com

  78. Development, Pizza and Beer: 3rd Thursday of every month, http://www.meetup.com/chilango-rails
  79. None
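
The analysis steps from slides 43 to 47 (char filter, tokenize, token filter), the inverted index, and the n-gram idea from slides 50 to 53 can be sketched in plain Python. This is only an illustration of the concepts, not how Elasticsearch or Lucene implement them. The function names, the stop-word list, and the document ids are assumptions made for the example, and `ngrams` emits every substring of at least three characters, which is slightly more than the grams listed on slide 51.

```python
import unicodedata
from collections import defaultdict

# Assumption: a tiny stop-word list chosen to reproduce the example on slide 46.
STOP_WORDS = {"el", "es", "mas", "que"}


def char_filter(text):
    """Strip accents, e.g. 'más' becomes 'mas'."""
    decomposed = unicodedata.normalize("NFD", text)
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


def tokenize(text):
    """Lowercase the text and split it on whitespace."""
    return text.lower().split()


def token_filter(tokens):
    """Drop stop words."""
    return [t for t in tokens if t not in STOP_WORDS]


def analyze(text):
    """Run the three steps in order: char filter, tokenize, token filter."""
    return token_filter(tokenize(char_filter(text)))


def ngrams(word, min_len=3):
    """Every substring of at least min_len characters, lowercased."""
    word = word.lower()
    return [
        word[i:j]
        for i in range(len(word))
        for j in range(i + min_len, len(word) + 1)
    ]


def build_index(docs, analyzer):
    """Inverted index: map each term to the set of document ids containing it."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for term in analyzer(text):
            index[term].add(doc_id)
    return index


# Hypothetical document ids for the four contacts used in the slides.
docs = {
    1: "Daniel Pliego",
    2: "Omaro Cancellara",
    3: "Alvaro Pereyra",
    4: "Gloria Palma",
}

word_index = build_index(docs, analyze)
ngram_index = build_index(
    docs, lambda text: [g for w in analyze(text) for g in ngrams(w)]
)
```

With these indexes, `word_index["omaro"]` finds a contact by a whole word, while `ngram_index["dan"]` or `ngram_index["plie"]` finds one by a fragment, which is the kind of partial search the `LIKE` queries on slide 49 were approximating.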