Upgrade to Pro — share decks privately, control downloads, hide ads and more …

Why you should care about SQL for big data and how Apache Calcite can help

Why you should care about SQL for big data and how Apache Calcite can help

FOSDEM 2017 talk: http://bit.ly/2xRV7By
Talk video: http://bit.ly/2xScodQ
When working with BigData & IoT systems we often feel the need for a Common Query Language. The platform specific languages are often harder to integrate with and require longer adoption time.

To fill this gap many NoSql (Not-only-Sql) vendors are building SQL layers for their platforms. It is worth exploring the driving forces behind this trend, how it fits in your BigData stacks and how we can adopt it in our favorite tools. However building SQL engine from scratch is a daunting job and frameworks like Apache Calcite can help you with the heavy lifting. Calcite allow you to integrate SQL parser, cost-based optimizer, and JDBC with your big data system.

Calcite has been used to empower many Big-Data platforms such as Hive, Spark, Drill Phoenix to name some.

I will walk you through the process of building a SQL access layer for Apache Geode (In-Memory Data Grid). I will share my experience, pitfalls and technical consideration like balancing between the SQL/RDBMS semantics and the design choices and limitations of the data system.

Hopefully this will enable you to add SQL capabilities to your prefered NoSQL data system.

"It will be interesting to see what happens if an established NoSQL database decides to implement a reasonably standard SQL; the only predictable outcome for such an eventuality is plenty of argument. - NoSQL Distilled, Martin Fowler - 2012"

The Relational Databases (RDBMS) are an essential component of the computing ecosystem. Yet in the past decade we have witnessed a wave of alternative data management technologies often branded as NoSQL and BigData - an ambiguous and lacking prescriptive definition names.

To understand the NoSQL/BigData "movement" one need to understand the forces fueling it:

* The rise of Internet (Web, Mobile, IoT...) leading to Data {Volume, Velocity and Variety} challenges
* Object-relational impedance mismatch
* Cloud computing - Infrastructure Automation and Elasticity
* Shift from Integration to Application databases
* Data-Value vs. Storage-Cost Economics Shift
The various approaches in addressing those challenges have led to a multitude of over 150 commercially supported NoSQL/BigData platforms.

Such diversity means that an organization will adopt a mixture of data storage technologies for handling different circumstances (Polyglot Persistence).

How does an organization integrate the mix of data technologies?

To fill the gap many NoSql/BigData vendors are (or are considering) building SQL and SQL-based layers for their platforms.

It is worth exploring the driving forces behind this trend ...

Christian Tzolov

February 04, 2017
Tweet

More Decks by Christian Tzolov

Other Decks in Technology

Transcript

  1. SQL for NoSQL and how
    Apache Calcite can help
    FOSDEM 2017

    View Slide

  2. Christian Tzolov
    2
    Engineer at Pivotal
    BigData, Hadoop, Spring Cloud Dataflow
    Apache Committer, PMC member
    Apache {Crunch, Geode, HAWQ, ...}
    Disclaimer This talk expresses my personal opinions. It is not read or approved by Pivotal and does not necessarily reflect the
    views and opinions of Pivotal nor does it constitute any official communication of Pivotal. Pivotal does not support any of the
    code shared here.
    blog.tzolov.net
    twitter.com/christzolov
    nl.linkedin.com/in/tzolov

    View Slide

  3. 3
    “It will be interesting to see what happens if an established
    NoSQL database decides to implement a reasonably
    standard SQL;
    The only predictable outcome for such an eventuality is plenty
    of argument.”
    2012, Martin Fowler, P.J.Sadalage, NoSQL Distilled

    View Slide

  4. Data Big Bang
    4
    Why?

    View Slide

  5. NoSQL Driving Forces
    5
    •  Infrastructure Automation and Elasticity (Cloud
    Computing)
    •  Rise of Internet Web, Mobile, IoT – Data Volume,
    Velocity, Variety challenges
    •  Row-based Relational Model. Object-Relational
    Impedance Mismatch
    ACID & 2PC clash with Distributed architectures. CAP, PAXOS instead..
    More convenient data models: Datastores, Key/Value, Graph, Columnar,
    Full-text Search, Schema-on-Load…
    Eliminate operational complexity and cost. Shift from Integration to application
    databases …

    View Slide

  6. Data Big Bang Implications
    6
    •  Over 150 commercial NoSQL and
    BigData systems.
    •  Organizations will have to mix data
    storage technologies!
    •  How to integrate such multitude of
    data systems?

    View Slide

  7. “Standard” Data Process/Query Language?
    7
    •  Functional - Unified Programming
    Model
    •  Apache {Beam, Spark, Flink,
    Apex, Crunch}, Cascading
    •  Converging around Apache
    Beam
    •  Declarative - SQL
    •  Adopted by many NoSQL
    Vendors
    •  Most Hadoop tasks: Hive and
    SQL-on-Hadoop
    •  Spark SQL - most used
    production component for 2016
    •  Google F1
    pcollection.apply(Read.from(”in.txt"))
    .apply(FlatMapElements.via((String word) ->
    asList(word.split("[^a-zA-Z']+")))
    .apply(Filter.by((String word)->!word.isEmpty()))
    .apply(Count.perElement())
    SELECT b."totalPrice", c."firstName”
    FROM "BookOrder" as b
    INNER JOIN "Customer" as c
    ON b."customerNumber" = c."customerNumber”
    WHERE b."totalPrice" > 0;
    Batch & Streaming, OLTP
    OLAP, EDW, Exploration

    View Slide

  8. SQL for NoSQL?
    8
    •  Extended Relational Algebra - already present in most NoSql data system
    •  Relational Expression Optimization – Desirable but hard to implement

    View Slide

  9. Organization Data - Integrated View
    9
    Single Federated DB (M:1:N)
    HAWQ FDBS
    NoSQL 1
    PXF 1
    Native
    API 1
    Apache HAWQ
    Optimizer, Columnar
    (HDFS)
    Organization Data Tools
    SQL/JDBC
    NoSQL 1
    PXF 2
    Native
    API 2
    NoSQL n
    PXF n
    Native
    API n

    Organization Data Tools
    NoSQL 1
    Calcite
    SQLAdapter 1
    SQL/JDBC
    NoSQL 2
    Calcite
    SQLAdapter 2
    SQL/JDBC
    NoSQL n
    Calcite
    SQLAdapter n
    SQL/JDBC

    Direct (M:N)
    https://issues.apache.org/jira/browse/HAWQ-1235

    View Slide

  10. Single Federated Database
    10
    Federated External Tables with Apache HAWQ - MPP, Shared-Noting, SQL-
    on-Hadoop
    CREATE EXTERNAL TABLE MyNoSQL
    (
    customer_id TEXT,
    first_name TEXT,
    last_name TEXT,
    gender TEXT
    )
    LOCATION ('pxf://MyNoSQL-URL>?
    FRAGMENTER=MyFragmenter&
    ACCESSOR=MyAccessor&
    RESOLVER=MyResolver&')
    FORMAT
    'custom'(formatter='pxfwritable_import');

    View Slide

  11. Apache Calcite?
    Java framework that allows SQL interface and advanced query optimization,
    for virtually any data system
    •  Query Parser, Validator and Optimizer(s)
    •  JDBC drivers - local and remote
    •  Agnostic to data storage and processing

    View Slide

  12. Calcite Application
    12
    •  Apache Apex
    •  Apache Drill
    •  Apache Flink
    •  Apache Hive
    •  Apache Kylin
    •  Apache Phoenix
    •  Apache Samza
    •  Apache Storm
    •  Cascading
    •  Qubole Quark
    •  SQL-Gremlin

    •  Apache Geode

    View Slide

  13. SQL Adapter Design Choices
    13
    SQL completeness vs. NoSql design integrity
    (simple) Predicate Pushdown: Scan, Filter,
    Projection
    (complex) Custom Relational Rules and
    Operations: Sort, Join, GroupBy ...
    Catalog – namespaces accessed in queries
    Schema - collection of schemas and tables
    Table - single data set, collection of rows
    RelDataType – SQL fields types in a Table
    •  Move Computation to Data
    •  Data Type Conversion

    View Slide

  14. Geode to Calcite Data Types Mapping
    14
    Geode Cache
    Region 1
    Region K
    Val
    Key
    v1
    k1
    v2
    k2

    Calcite Schema
    Table 1
    Table K
    Col1 Col2 ColN
    V(M,1)
    RowM V(M,2) V(M,N)
    V(2,1)
    Row2 V(2,2) V(2,N)
    V(1,1)
    Row1 V(1,2) V(1,N)

    Regions are mapped into Tables
    Geode Cache is mapped into Calcite Schema
    Geode Key/Value is mapped
    into Table Row
    Create Column Types
    (RelDataType) from Geode
    Value class
    (JavaTypeFactoryImpl)

    View Slide

  15. Geode Adapter - Overview
    Geode API and OQL
    SQL/JDBC/ODBC
    Convert SQL relational
    expressions into OQL queries
    Geode Adapter
    (Geode Client)
    Geode Server
    Geode Server
    Geode Server
    Data Data Data
    Push down the relational
    expressions supported by Geode
    OQL and falls back to the Calcite
    Enumerable Adapter for the rest
    Enumerable
    Adapter
    Apache Calcite
    Spring Data
    Geode
    Spring Data API for
    interacting with Geode
    Parse SQL, converts into
    relational expression and
    optimizes

    View Slide

  16. Simple SQ L Adapter
    16
    <>
    MySchemaFactory
    +create(operands):Schema
    <>
    <>
    MyTable
    +getRowType(RelDataTypeFactor)
    +scan(ctx):Ennumerator
    <>
    MySchema
    +getTableMap():Map)
    <>
    <>
    MyEnummerator
    +moveNext()
    +convert(Object):E
    My NoSQL
    <>
    <>
    defaultSchema: 'MyNoSQL',
    schemas: [{
    name: ’MyNoSQLAdapter,
    factory: MySchemaFactory’,
    operand: { myNoSqlUrl: …, }
    }]
    !connect jdbc:calcite:model=path-to-model.json
    Returns an Enumeration
    over the entire target
    data store
    Uses reflection to builds
    RelDataType from your
    value’s class type
    Converts MyNoSQL
    value response into
    Calcite row data
    Defined in the Linq4j
    sub-project
    ScannableTable,
    FilterableTable,
    ProjectableFilterableTable
    Initialize
    Query SELECT b."totalPrice”
    FROM "BookOrder" as b
    WHERE b."totalPrice" > 0;

    View Slide

  17. Non-Relational Tables (Simple)
    17
    Scanned without intermediate relational expression.
    •  ScannableTable - can be scanned
    •  FilterableTable - can be scanned, applying supplied filter expressions
    •  ProjectableFilterableTable - can be scanned, applying supplied filter expressions
    and projecting a given list of columns
    Enumerable scan(DataContext root, List filters, int[] projects);
    Enumerable scan(DataContext root, List filters);
    Enumerable scan(DataContext root);

    View Slide

  18. Calcite Ecosystem
    18
    Several “semi-independent” projects.
    JDBC and Avatica
    Linq4j
    Expression
    Tree
    Enumerable
    Adapter
    Relational
    •  Relational Expressions
    •  Row Expression
    •  Optimization Rules
    •  Planner …
    SQL Parser & AST
    Port of LINQ (Language-Integrated Query)
    to Java.
    Local and Remote
    JDBC driver
    Converts SQL
    queries Into AST
    (SqlNode …)
    3rd party Adapters
    Method for translating
    executable code into data
    (LINQ/MSN port)
    Default (In-memory) Data
    Store Adapter
    implementation.
    Leverages Linq4j
    Relational Algebra,
    expression,
    optimizations …
    Interpreter
    Complies Java code
    generated from linq4j
    “Expressions”. Part of the
    physical plan implementer

    View Slide

  19. Calcite SQL Query Execution Flow
    19
    Enumerable
    Interpreter
    Prepare
    SQL,
    Relational,
    Planner
    Geode
    Adapter
    Binder
    JDBC
    Geode
    Cluster
    1
    2
    3
    4
    5
    6 7
    7
    7
    2. Parse SQL, convert to rel.
    expressions. Validate and Optimize
    them
    3. Start building a physical plan from
    the relation expressions
    4. Implement the Geode relations and
    encode them as Expression tree
    5. Pass the Expression tree to the
    Interpreter to generate Java code
    6. Generate and Compile a Binder
    instance that on ‘bind()’ call runs
    Geodes’ query method
    1. On new SQL query JDBC delegates
    to Prepare to prepare the query
    execution
    7. JDBC uses the newly compiled
    Binder to perform the query on the
    Geode Cluster
    Calcite Framework
    Geode Adapter
    2

    View Slide

  20. Calcite Relational Expressions
    20
    RelNode
    Relational
    expression
    TableScan
    Project
    Filter
    Aggregate
    Join
    Intersect
    Sort
    RexlNode
    Row-level
    expressions
    Project, Sort fields
    Filter, Join conditions
    Input Column
    Ref
    Literal
    Struct field
    access
    Function call
    Window
    expressions
    *
    RelTrait
    *
    Physical attribute
    of a relation

    View Slide

  21. Calcite Relational Expressions
    21
    RelNode
    + register(RelOptPlander)
    + List getInputs();
    RelOptPlanner
    +findBestExp():RelNode
    RexNode
    RelTrait Convention
    NONE
    *
    *
    EnumberableConvention
    RelOptRule
    + onMatch(call)
    <>
    <>
    MyDBConvention
    ConverterRule
    + RelNode convert(RelNode)
    Converts from one calling
    convention to another
    Convertor
    Indicate that it converts a
    physical attribute only!
    <>
    *
    <>
    *
    <>
    Query optimizer: Transforms a
    relational expression according to
    a given set of rules and a cost model.
    RelOptCluster
    Rule transforms an expression into another. It has a list of
    Operands, which determine whether the rule can be applied to
    a particular section of the tree.
    RelOptRuleOperand
    *
    <>
    Calling convention used
    to represent a single
    data source.
    Inputs to a relational
    expression must be in
    the same convention

    View Slide

  22. Calcite Adapter Implementation Patterns
    22
    MyAdapterRel
    + implement(implContext)
    MyAdapterConvention
    Convention.Impl(“MyAdapter”)
    Common interface for all MyAdapter
    Relation Expressions. Provides
    implementation callback method called
    as part of physical plan implementation
    ImplContext
    + implParm1
    + implParm2 …
    RelNode
    MyAdapterTable
    + toRel(optTable)
    + asQueryable(provider,…)
    MyAdapterQueryable
    + myQuery(params) :
    Enumetator
    TranslatableTable
    <>
    AbstractQueryableTable
    AbstractTableQueryable <>
    Can convert
    queries in
    Expression
    myQuery() implements the call to your DB
    It is called by the auto generated code. It
    must return an Enumberable instance
    MyAdapterScan
    + register(planer) {
    Registers all MyAdapter Rules
    }
    <>
    MyAdapterToEnumerableConvertorRule
    operands: (RelNode.class,
    MyAdapterConvention, EnumerableConvention) ConverterRue
    TableScan
    MyAdapterToEnumerableConvertor
    + implement(EnumerableRelImplementor) {
    ctx = new MyAdapterRel.ImplContext()
    getImputs().implement(ctx)
    return BlockBuild.append( MY_QUERY_REF,
    Expressions.constant(ctx.implParms1),
    Expressions.constant(ctx.implParms2) …
    EnumerableRel
    ConvertorImpl
    <>
    MyAdapterProject
    MyAdapterFilter
    MyAdapterXXX
    RelOptRule
    MyAdapterProjectRu
    MyAdapterFilterRule
    MyAdapterXXXRule
    <>
    Recursively call the implement on each
    MyAdapter Relation Expression
    Encode the myQuery(params) call as
    Expressions
    MY_QUERY_REF = Types.lookupMethod(
    MyAdapterQueryable.class,
    ”myQuery”,
    String.class
    String.class);
    1
    3
    4
    5
    2
    6
    7
    8
    9
    Calcite Framework
    MyAdapter components

    View Slide

  23. Relational Algebra
    23
    Scan Scan
    Join
    Filter
    Project
    Customer [c] BookOrder [b]
    (on customerNumber)
    (b.totalPrice > 0)
    (c.firstName, b.totalPrice)
    SELECT b."totalPrice", c."firstName”
    FROM "BookOrder" as b
    INNER JOIN "Customer" as c ON b."customerNumber" = c."customerNumber”
    WHERE b."totalPrice" > 0;
    Scan Scan
    Join
    Project
    Customer [c] BookOrder [b]
    (on customerNumber)
    (totalPrice > 0)
    (c.firstName, b.totalPrice)
    Project
    (firstName,
    customerNumber)
    Filter
    (totalPrice,
    customerNumber)
    Project
    optimize

    View Slide

  24. Calcite with Geode - Without Implementation
    24
    SELECT b."totalPrice", c."firstName”
    FROM "BookOrder" as b
    INNER JOIN "Customer" as c ON b."customerNumber" = c."customerNumber”
    WHERE b."totalPrice" > 0;

    View Slide

  25. Calcite with Geode – Scannable Table (Simple)
    25
    SELECT b."totalPrice", c."firstName”
    FROM "BookOrder" as b
    INNER JOIN "Customer" as c ON b."customerNumber" = c."customerNumber”
    WHERE b."totalPrice" > 0;

    View Slide

  26. Calcite with Geode – Relational (Complex)
    26
    SELECT b."totalPrice", c."firstName”
    FROM "BookOrder" as b
    INNER JOIN "Customer" as c ON b."customerNumber" = c."customerNumber”
    WHERE b."totalPrice" > 0;

    View Slide

  27. Calcite JDBC Connection
    27

    View Slide

  28. References
    28
    •  Big Data is Four Different Problems, 2016, M.Stonebraker:
    https://www.youtube.com/watch?v=S79-buNhdhI
    •  Turning Database Inside-Out, 2015 (M. Kleppmann)
    https://www.confluent.io/blog/turning-the-database-inside-out-with-apache-samza
    •  NoSQL Distilled, 2012 (Pramod J. Sadalage and M.Fowler)
    https://martinfowler.com/books/nosql.html
    •  Architecture of a Database System, 2007 (J.M. Hellerstein, M. Stonebraker, J.
    Hamilton)http://db.cs.berkeley.edu/papers/fntdb07-architecture.pdf
    •  ORCA: A Modular Query Optimizer Architecture for Big Data:
    http://15721.courses.cs.cmu.edu/spring2016/papers/p337-soliman.pdf
    •  Apache Geode Project (2016) : http://geode.apache.org
    •  Geode Object Query Language (OQL) : http://bit.ly/2eKywgp
    •  Apache Calcite Project (2016) : https://calcite.apache.org
    •  Apache Geode Adapter for Apache Calcite: https://github.com/tzolov/calcite
    •  Relational Algebra Operations: https://www.coursera.org/learn/data-manipulation/lecture/
    4JKs1/relational-algebra-operators-union-difference-selection

    View Slide

  29. Thanks!

    View Slide

  30. Apache Geode?
    “… in-memory, distributed database
    with strong consistency built to
    support low latency transactional
    applications at extreme scale”

    View Slide

  31. Why Apache Geode?
    31
    5,700 train stations
    4.5 million tickets per day
    20 million daily users
    1.4 billion page views per day
    40,000 visits per second
    7,000 stations
    72,000 miles of track
    23 million passengers daily
    120,000 concurrent users
    10,000 transactions per minute
    https://pivotal.io/big-data/case-study/distributed-in-memory-data-management-solution
    https://pivotal.io/big-data/case-study/scaling-online-sales-for-the-largest-railway-in-the-world-china-railway-corporation
    China Railway

    View Slide

  32. Geode Features
    32
    •  In-Memory Data Storage
    –  Over 100TB Memory
    –  JVM Heap + Off Heap
    •  Any Data Format
    –  Key-Value/Object Store
    •  ACID and JTA Compliant
    Transactions
    •  HA and Linear Scalability
    •  Strong Consistency
    •  Streaming and Event Processing
    –  Listeners
    –  Distributed Functions
    –  Continuous OQL Queries
    •  Multi-site / Inter-cluster
    •  Full Text Search (Lucene indexes)
    •  Embedded and Standalone
    •  Top Level Apache Project

    View Slide

  33. Apache Geode Concepts
    Cache Server (member)
    Cache
    Region 1
    Region N
    Val
    Ke
    y
    v1
    k1
    v2
    k2

    Cache - In-memory collection
    of Regions
    Region - consistent, distributed
    Map (key-value),
    Partitioned or Replicated
    CacheServer – process
    connected to the distributed
    system with created Cache
    Client
    Locator (member)
    Client –read and modify the
    content of the distributed
    system
    Locator – tracks system
    members and provides
    membership information

    Listeners
    Functions
    Functions – distributed,
    concurrent data
    processing
    Listener – event handler.
    Registers for one or
    more events and notified
    when they occur

    View Slide

  34. Geode Topology
    Cache Server
    Cache Server
    Cache Server
    Cache Data Cache Data Cache Data
    Peer-to-Peer
    Cache Server
    Cache Server
    Cache Server
    Cache Data Cache Data Cache Data
    Client
    Local Cache
    pool
    Client-Server
    Cache Server
    Cache Server
    Gateway Sender

    Cache Server
    Gateway Receiver
    Cache Server
    Cache Server
    Cache Data Cache Data Cache Data Cache Data
    Gateway Receiver
    Cache Server
    … Gateway Sender
    Cache Server
    Cache Server
    Cache Data Cache Data Cache Data Cache Data
    WAN Multi-site Boundary
    Multi-Site

    View Slide