Upgrade to Pro — share decks privately, control downloads, hide ads and more …

Elasticsearch SQL

Elasticsearch SQL

Do you find yourself pointing new Elasticsearch users every which way to various reference guides, language documentations, and example console snippets? Are you onboarding new Elasticsearch users to your company/organization and trying to get them up to speed? Come see what Costin's been cookin'!

In this session, you'll get a taste of a new Elastic {Re}search project and some of the recipes behind it. Costin will discuss how Elastic is working to make Elasticsearch more accessible to users, applications, and systems. Join us and bring your own ideas and an open mind.

Costin Leau l Software Engineer l Elastic

Elastic Co

March 08, 2017
Tweet

More Decks by Elastic Co

Other Decks in Technology

Transcript

  1. SQL characteristics Declarative Performs data definition, manipulation and control SQL:1999

    is Turing complete Can do fractals Fourth-generation programming language (4GL) https://wiki.postgresql.org/wiki/Mandelbrot_set
  2. 7 SELECT * FROM emp.emp SELECT last_name, first_date FROM emp.emp

    WHERE tenure > 20 What to do, not how to do it Declarative Language
  3. 1. SELECT 2. FROM 3. WHERE 4. GROUP BY 5.

    HAVING 6. ORDER BY Declaration vs execution order 8 2. FROM 3. WHERE 4. GROUP BY 5. HAVING 1. SELECT 6. ORDER BY
  4. Data/Tabular format SQL uses table references (not columns) Data is

    ‘pipelined’ to the user • consumers move back and forth • each row has a backing catalog/table (source) • data might be updated
  5. Support Elasticsearch features as much as possible Support standard SQL

    as much as possible Same experience out of the box as Elasticsearch • Lightweight & fast • Easy to pick up • Still Elasticsearch Whiteboard Goal – Elasticsearch SQL
  6. SQL landscape at a glance RDBMS Cloud solutions Dedicated SQL

    storage Generic, in-memory compute/SQL
  7. A SQL interpreter anatomy Analysis Planning Execution Parsing SQL QUERY

    Unresolved AST Resolved/Logical Plan Optimized Plan Physical Plan Client Results
  8. 14 Reusable Customized Highly-specific • Parsing • Domain model •

    Optimizer • Parsing • Domain model (plans) • Optimizer • Analysis • Physical plan • Execution Using an existing SQL foundation
  9. Read-only SQL Targets single index/type Features: 50% SQL, 50% Elasticsearch

    Runs inside Elasticsearch Zero runtime footprint Current iteration
  10. Schema employees birth_date emp_no first_name gender last_name salary tenure department

    dept_name dept_no from_date to_date 1..n employees birth_date emp_no first_name gender last_name salary tenure department dept_name dept_no from_date to_date 1..n Tables Nested docs
  11. Changes the plan without affecting its output Logical optimizations •

    Operator simplifications • Operator pruning • Execution simplification • Execution elimination Optimizer
  12. Optimizer SELECT last_name FROM emp.emp WHERE gender = 'M' AND

    1000 < emp_no AND NOT FALSE OR last_name LIKE 'K%' AND emp_no > 10010
  13. Query Planner Converts the logical plan into a physical plan

    SELECT last_name FROM emp.emp WHERE emp_no > 10010 {"query" : { "range" : { "emp_no" : { "from" : 10010, "to" : null }}}, "_source" : { "includes" :["last_name"] } }
  14. Runs the actual query Streams results Keep track of references

    Handles tabular format (aggs as a tree vs as a table) Data type conversion Query Executor
  15. Analyzed Full-text search Default for strings (text) Actual value depends

    on the analyzer By default has a not-analyzed field (keyword) Analyzed vs not-analyzed Not-analyzed Aggregations and exact matching Default for non-strings Required for sorting and aggregations
  16. "match" : { "gender" : { "query" : "M" }

    } Analyzed vs not-analyzed - Example SELECT … WHERE gender = 'M' "term" : { "gender" : { "value" : "M" } }
  17. "_source" : { "includes" : [ "birth_date" ] } Analyzed

    vs not-analyzed - Example "docvalue_fields" : [ "emp_no", "first_name", "gender", "last_name", "salary“, "tenure" ]
  18. "sort" : [{ "gender.keyword" : { "order" : "asc" }

    }] Sorting SELECT … ORDER BY gender
  19. Full-Text Search "match" : { "first_name" : { "query" :

    "Erez", "operator" : "OR“ } } SELECT … WHERE MATCH(first_name, ‘Erez’)
  20. Full-Text Search SELECT … WHERE MATCH(‘first_name, last_name^2’, ‘Erez’) "multi_match" :

    { "query" : "Erez", "fields" : [ "first_name^1.0", "last_name^2.0"] }
  21. Aggregate Ex: Avg, Count, Max Map onto metrics aggregations Functions

    Scalar Ex: Date extraction Painlessly through scripting Monotonic functions are well optimized
  22. Group By & Having SELECT column_name, aggregate_function(column_name) FROM ... WHERE

    ... GROUP BY column_name HAVING aggregate_function(column_name) operator value;
  23. SELECT column_name, aggregate_function(column_name) FROM ... WHERE ... GROUP BY column_name

    HAVING aggregate_function(column_name) operator value; Group By & Having Bucket Agg Pipeline Agg Metric Agg
  24. Supported through Nested and Parent-Child Joins employees birth_date emp_no first_name

    gender last_name salary tenure department dept_name dept_no from_date to_date 1..n Nested docs SELECT dep.dept_name, last_name FROM emp.emp WHERE emp_no = 10020 SELECT last_name FROM emp.emp WHERE dep.dept_name = 'Production'
  25. Dedicated client (driver) and server component JDBC 4.2/Java 8 (downgrade

    possible) Supports java.sql and javax.sql APIs Pays attention to details • Timeouts (connect vs read vs network) • Logging Light, without dependencies JDBC
  26. Except where otherwise noted, this work is licensed under http://creativecommons.org/licenses/by-nd/4.0/

    Creative Commons and the double C in a circle are registered trademarks of Creative Commons in the United States and other countries. Third party marks and brands are the property of their respective holders. 47 Please attribute Elastic with a link to elastic.co