Large-scale, interactive ad-hoc queries over different datastores with Apache Drill

Large-scale, interactive ad-hoc queries over different datastores with Apache Drill
Michael Hausenblas, Chief Data Engineer, MapR Technologies
JAX London, 2013-10-29

Which workloads do you encounter in your environment?
(Image: http://www.flickr.com/photos/kevinomara/2866648330/, licensed under CC BY-NC-ND 2.0)

Batch processing
… for recurring tasks such as large-scale data mining, ETL offloading/data-warehousing
→ for the batch layer in Lambda architecture
Apache Pig, Cascalog

OLTP
… user-facing eCommerce transactions, real-time messaging at scale (FB), time-series processing, etc.
→ for the serving layer in Lambda architecture

Stream processing
… in order to handle stream sources such as social media feeds or sensor data (mobile phones, RFID, weather stations, etc.)
→ for the speed layer in Lambda architecture

Search/Information Retrieval
… retrieval of items from unstructured documents (plain text, etc.), semi-structured data formats (JSON, etc.), as well as data stores (MongoDB, CouchDB, etc.)

But what about interactive ad-hoc query at scale?
(Image: http://www.flickr.com/photos/9479603@N02/4144121838/, licensed under CC BY-NC-ND 2.0)

Impala Interactive Query (?) low-latency

Use Case: Marketing Campaign
•  Jane, a marketing analyst
•  Determine target segments
•  Data from different sources

Use Case: Logistics
•  Supplier tracking and performance
•  Queries
   – Shipments from supplier 'ACM' in last 24h
   – Shipments in region 'US' not from 'ACM'
SUPPLIER_ID  NAME                 REGION
ACM          ACME Corp            US
GAL          GotALot Inc          US
BAP          Bits and Pieces Ltd  Europe
ZUP          Zu Pli               Asia
{ "shipment": 100123, "supplier": "ACM", "timestamp": "2013-02-01", "description": "first delivery today" },
{ "shipment": 100124, "supplier": "BAP", "timestamp": "2013-02-02", "description": "hope you enjoy it" }
…
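To make the two logistics queries concrete, here is a minimal Python sketch that answers them by hand over the sample data from the slide. It is not Drill code: the function names, the in-memory lists, and the choice to treat the date-only timestamps as midnight are all assumptions made for illustration. The point is only that the queries need a join across a tabular source and a JSON source.

```python
from datetime import datetime, timedelta

# Supplier reference table from the slide (imagine it lives in a relational store).
SUPPLIERS = [
    {"supplier_id": "ACM", "name": "ACME Corp", "region": "US"},
    {"supplier_id": "GAL", "name": "GotALot Inc", "region": "US"},
    {"supplier_id": "BAP", "name": "Bits and Pieces Ltd", "region": "Europe"},
    {"supplier_id": "ZUP", "name": "Zu Pli", "region": "Asia"},
]

# Shipment documents from the slide (imagine they live in a JSON document store).
SHIPMENTS = [
    {"shipment": 100123, "supplier": "ACM", "timestamp": "2013-02-01",
     "description": "first delivery today"},
    {"shipment": 100124, "supplier": "BAP", "timestamp": "2013-02-02",
     "description": "hope you enjoy it"},
]


def shipments_from(supplier_id, now, window=timedelta(hours=24)):
    """Shipments from one supplier with a timestamp in [now - window, now].

    Assumption: date-only timestamps are interpreted as midnight of that day.
    """
    result = []
    for shipment in SHIPMENTS:
        ts = datetime.strptime(shipment["timestamp"], "%Y-%m-%d")
        if shipment["supplier"] == supplier_id and now - window <= ts <= now:
            result.append(shipment)
    return result


def shipments_in_region_excluding(region, excluded_supplier_id):
    """Join shipments to suppliers on the supplier id, then filter by region."""
    region_by_supplier = {row["supplier_id"]: row["region"] for row in SUPPLIERS}
    return [
        shipment for shipment in SHIPMENTS
        if region_by_supplier.get(shipment["supplier"]) == region
        and shipment["supplier"] != excluded_supplier_id
    ]


if __name__ == "__main__":
    print(shipments_from("ACM", now=datetime(2013, 2, 1, 12, 0)))
    print(shipments_in_region_excluding("US", "ACM"))
```

With only the two sample shipments shown on the slide, the second query finds nothing, because the only US shipment comes from 'ACM'.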

Use Case: Crime Detection
•  Online purchases
•  Fraud, bilking, etc.
•  Batch-generated overview
•  Modes
   – Explorative
   – Alerts

Requirements
•  Support for different data sources
•  Support for different query interfaces
•  Low-latency/real-time
•  Ad-hoc queries
•  Scalable, reliable

And now for something completely different …

Google’s Dremel
http://research.google.com/pubs/pub36632.html
Sergey Melnik, Andrey Gubarev, Jing Jing Long, Geoffrey Romer, Shiva Shivakumar, Matt Tolton, Theo Vassilakis, Proc. of the 36th Int'l Conf on Very Large Data Bases (2010), pp. 330-339
"Dremel is a scalable, interactive ad-hoc query system for analysis of read-only nested data. By combining multi-level execution trees and columnar data layout, it is capable of running aggregation queries over trillion-row tables in seconds. The system scales to thousands of CPUs and petabytes of data, and has thousands of users at Google. …"

Google’s Dremel: multi-level execution trees, columnar data layout

Google’s Dremel: nested data + schema, column-striped representation, map nested data to tables
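As a rough illustration of what "column-striped representation" means, the sketch below walks nested records and collects every leaf value into one list per field path. This is a deliberately simplified stand-in: Dremel additionally stores repetition and definition levels with each value so that records can be reassembled, and that part is omitted here. The function name and the sample records are hypothetical.

```python
from collections import defaultdict


def column_stripe(records):
    """Collect leaf values of nested records into one list per dotted field path.

    Simplification: no repetition/definition levels are kept, so the original
    record boundaries cannot be reconstructed from the result.
    """
    columns = defaultdict(list)

    def walk(value, path):
        if isinstance(value, dict):
            for key, child in value.items():
                walk(child, path + (key,))
        elif isinstance(value, list):
            # Repeated field: every element contributes to the same column.
            for item in value:
                walk(item, path)
        else:
            columns[".".join(path)].append(value)

    for record in records:
        walk(record, ())
    return dict(columns)


if __name__ == "__main__":
    # Hypothetical nested documents, not taken from the slides.
    docs = [
        {"doc_id": 10,
         "name": [{"url": "http://A",
                   "language": [{"code": "en-us"}, {"code": "en"}]}]},
        {"doc_id": 20, "name": [{"url": "http://C"}]},
    ]
    for path, values in column_stripe(docs).items():
        print(path, values)
```

A query that only aggregates over one field then needs to read just that field's column rather than whole records, which is the main benefit of the columnar layout.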

Google’s Dremel experiments: datasets & query performance

Back to Apache Drill …

Apache Drill: key facts
•  Inspired by Google’s Dremel
•  Standard SQL 2003 support
•  Pluggable data sources
•  Nested data is a first-class citizen
•  Schema is optional
•  Community driven, open, hundreds involved

High-level Architecture

Principled Query Execution
•  Source query: what we want to do (analyst friendly)
•  Logical Plan: what we want to do (language agnostic, computer friendly)
•  Physical Plan: how we want to do it (the best way we can tell)
•  Execution Plan: where we want to do it

Principled Query Execution
Source Query → Parser → Logical Plan → Optimizer → Physical Plan → Execution
(Diagram labels: SQL 2003, DrQL, MongoQL, DSL; parser API; scanner API; Topology, CF, etc.)
query: [ { @id: "log", op: "sequence", do: [ { op: "scan", source: "logs" }, { op: "filter", condition: "x > 3" }, …
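The logical plan fragment on this slide is a sequence of operators, a scan followed by a filter. The following toy interpreter shows how such a plan can be evaluated step by step. It is a sketch under stated assumptions and not Drill's plan format or executor: the plan is written as a Python dict, the condition grammar is limited to "field operator integer", and the names `run_sequence`, `parse_condition`, and the sample `logs` rows are invented for illustration.

```python
import operator

# Comparison operators supported by this toy condition parser (an assumption).
OPERATORS = {">": operator.gt, "<": operator.lt, "==": operator.eq}


def parse_condition(condition):
    """Turn a string such as 'x > 3' into a predicate over row dicts."""
    field, symbol, literal = condition.split()
    compare = OPERATORS[symbol]
    threshold = int(literal)
    return lambda row: compare(row[field], threshold)


def run_sequence(plan, sources):
    """Evaluate a 'sequence' plan: each step consumes the previous step's rows."""
    if plan["op"] != "sequence":
        raise ValueError("only 'sequence' plans are supported in this sketch")
    rows = []
    for step in plan["do"]:
        if step["op"] == "scan":
            rows = list(sources[step["source"]])
        elif step["op"] == "filter":
            predicate = parse_condition(step["condition"])
            rows = [row for row in rows if predicate(row)]
        else:
            raise ValueError("unsupported operator: " + step["op"])
    return rows


if __name__ == "__main__":
    # Mirrors the visible part of the slide's fragment; later steps are elided there.
    plan = {
        "@id": "log",
        "op": "sequence",
        "do": [
            {"op": "scan", "source": "logs"},
            {"op": "filter", "condition": "x > 3"},
        ],
    }
    sources = {"logs": [{"x": 1}, {"x": 5}, {"x": 3}, {"x": 7}]}  # hypothetical data
    print(run_sequence(plan, sources))
```

Because the plan names operators rather than SQL syntax, any front end that can emit such a structure can share the same optimizer and execution stages.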

Wire-level Architecture
•  Each node: Drillbit, to maximize data locality
•  Co-ordination, query planning, execution, etc., are distributed
•  Any node can act as endpoint for a query (the foreman)
(Diagram: four nodes, each running a storage process and a Drillbit)

Wire-level Architecture
•  Curator/Zookeeper for ephemeral cluster membership info
•  Distributed cache (Hazelcast) for metadata, locality information, etc.
(Diagram: Curator/Zk alongside four nodes, each running a storage process, a Drillbit, and a distributed cache)

Wire-level Architecture
•  Originating Drillbit acts as foreman: manages query execution, scheduling, locality information, etc.
•  Streaming data communication avoiding SerDe
(Diagram: Curator/Zk alongside four nodes, each running a storage process, a Drillbit, and a distributed cache)

Wire-level Architecture
The foreman becomes the root of the multi-level execution tree; the leaves activate their storage engine interface.
(Diagram: tree of nodes, with Curator/Zk)
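The idea of a multi-level execution tree can be sketched in a few lines: the root fans a query out to its children, each leaf scans only its local data and returns a partial aggregate, and inner nodes combine the partials on the way back up. The classes below are hypothetical and run in a single process; in Drill the nodes would be Drillbits on different machines, and nothing here reflects Drill's actual classes or RPC layer.

```python
class Leaf:
    """A leaf node that scans only the rows stored locally."""

    def __init__(self, local_rows):
        self.local_rows = local_rows

    def execute(self, predicate):
        matching = [value for value in self.local_rows if predicate(value)]
        # Return a partial aggregate (count, sum) instead of the raw rows.
        return len(matching), sum(matching)


class Inner:
    """An inner node (or the root/foreman) that merges its children's partials."""

    def __init__(self, children):
        self.children = children

    def execute(self, predicate):
        count, total = 0, 0
        for child in self.children:
            child_count, child_total = child.execute(predicate)
            count += child_count
            total += child_total
        return count, total


if __name__ == "__main__":
    # Hypothetical three-leaf tree with two intermediate nodes below the root.
    root = Inner([
        Inner([Leaf([1, 2, 3]), Leaf([4, 5])]),
        Inner([Leaf([6, 7, 8])]),
    ])
    count, total = root.execute(lambda value: value > 3)
    print("count:", count, "sum:", total)
```

Only small partial results travel up the tree, which is why aggregation queries over very large tables can stay interactive.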

On the shoulders of giants …
•  Jackson for JSON SerDe for metadata
•  Typesafe HOCON for configuration and module management
•  Netty4 as core RPC engine, protobuf for communication
•  Vanilla Java, Larray and Netty ByteBuf for off-heap large data structures
•  Hazelcast for distributed cache
•  Netflix Curator on top of Zookeeper for service registry
•  Optiq for SQL parsing and cost optimization
•  Parquet (http://parquet.io)/ORC
•  Janino for expression compilation
•  ASM for bytecode manipulation
•  Yammer Metrics for metrics
•  Guava extensively
•  Carrot HPPC for primitive collections

Key features
•  Full SQL – ANSI SQL 2003
•  Nested Data as first class citizen
•  Optional Schema
•  Extensibility Points …

Extensibility Points
•  Source query → parser API
•  Custom operators, UDF → logical plan
•  Serving tree, CF, topology → physical plan/optimizer
•  Data sources & formats → scanner API
Source Query → Parser → Logical Plan → Optimizer → Physical Plan → Execution

User Interfaces
•  API: DrillClient
   – Encapsulates endpoint discovery
   – Supports logical and physical plan submission, query cancellation, query status
   – Supports streaming return results
•  JDBC driver, converting JDBC into DrillClient communication
•  REST proxy for DrillClient

User Interfaces

LET’S GET OUR HANDS DIRTY…

Demo
•  Install
•  Preparation
$ wget http://people.apache.org/~jacques/apache-drill-1.0.0-m1.rc3/apache-drill-1.0.0-m1-binary-release.tar.gz
$ tar -zxf apache-drill-1.0.0-m1-binary-release.tar.gz
$ export JAVA_HOME=/Library/Java/JavaVirtualMachines/jdk1.7.0_11.jdk/Contents/Home
$ export DRILL_LOG_DIR=$PWD
$ ./bin/drillbit.sh start

Demo: submitting physical plan in a 3-node cluster
Test 1: Scan JSON doc
$ bin/submit_plan -f sample-data/physical_json_scan_test1.json -t physical -zk 127.0.0.1:2181
Test 2: Scan Parquet doc
$ bin/submit_plan -f sample-data/parquet_scan_union_screen_physical.json -t physical -zk 127.0.0.1:2181

Demo: SQL on single node
$ ./bin/sqlline -u jdbc:drill:schema=parquet-local
0: jdbc:drill:schema=parquet-local> SELECT _MAP['N_REGIONKEY'] as regionKey, _MAP['N_NAME'] as name FROM "sample-data/nation.parquet" WHERE cast(_MAP['N_NAME'] as varchar) < 'M';

Demo: DIY
https://github.com/mhausenblas/apache-drill-sandbox/

Useful Resources
•  Getting Started guide https://github.com/vrtx/incubator-drill/blob/getting_started/docs/getting_started.rst
•  Demo HowTo https://cwiki.apache.org/confluence/display/DRILL/Demo+HowTo
•  How to build/install Apache Drill on Ubuntu 13.04 http://www.confusedcoders.com/bigdata/apache-drill/how-to-build-apache-drill-on-ubuntu-13-04

BE A PART OF IT!

Status
•  Heavy development by multiple organizations (MapR, Pentaho, Microsoft, Thoughtworks, XingCloud, etc.)
•  Currently more than 100k LOC
•  M1 Alpha available via http://www.apache.org/dyn/closer.cgi/incubator/drill/drill-1.0.0-m1-incubating/

Kudos to …
•  Julian Hyde, Pentaho
•  Lisen Mu, XingCloud
•  Tim Chen, Microsoft
•  Chris Merrick, RJMetrics
•  David Alves, UT Austin
•  Sree Vaadi, SSS
•  Srihari Srinivasan, ThoughtWorks
•  Alexandre Beche, CERN
•  Jason Altekruse, MapR
•  Ben Becker, MapR
•  Jacques Nadeau, MapR
•  Ted Dunning, MapR
•  Keys Botzum, MapR
•  Jason Frantz
•  Ellen Friedman
•  Chris Wensel, Concurrent
•  Gera Shegalov, Oracle
•  Ryan Rawson, Ohm Data
http://incubator.apache.org/drill/team.html

Contributing
Contributions appreciated, not only code drops …
•  Test data & test queries
•  Use case scenarios (textual/SQL queries)
•  Documentation

Engage!
•  Follow @ApacheDrill on Twitter
•  Sign up at mailing lists (user | dev) http://incubator.apache.org/drill/mailing-lists.html
•  Standing G+ hangouts every Tuesday at 5pm GMT http://j.mp/apache-drill-hangouts
•  Keep an eye on http://drill-user.org/
