A whirlwind tour through Lingual: ANSI SQL for Apache Hadoop

André Kelpe
November 14, 2013

Slides from my #devoxx 2013 talk about Cascading Lingual

Transcript

  1. @fs111
    #DV13 #lingual
    A whirlwind tour through
    Lingual: ANSI SQL for Apache
    Hadoop
    André Kelpe
    concurrentinc.com

  2. Speaker
    André Kelpe
    Software Engineer at Concurrent
    The company behind Cascading and Lingual
    concurrentinc.com / @concurrent
    @fs111

  3. Agenda
    Cascading and Lingual
    Lingual: design goals
    Lingual: features
    Demo: Lingual in action
    Q&A

  4. Cascading
    http://cascading.org

  5. Cascading terminology
    Taps are sources and sinks for data
    Schemes represent the format of the data
    Pipes connect Taps

  6. Cascading terminology
    Tuples flow through Pipes
    Fields describe the Tuples
    Operations are executed on Tuples in
    TupleStreams
    Flows get scheduled and executed
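
The terminology on the two slides above can be sketched as a toy model. This is a minimal illustration in plain Python, not the Cascading API: the names `MemoryTap`, `DelimitedScheme`, `Pipe`, `run_flow`, and the sample data are all hypothetical stand-ins chosen to mirror the terms Tap, Scheme, Pipe, Fields, Operation, and Flow.

```python
from dataclasses import dataclass, field
from typing import Callable, Iterable, Iterator, List, Tuple

# Toy stand-ins for the concepts on the slides; these are NOT Cascading classes.
Row = Tuple[str, ...]
Operation = Callable[[Iterable[Row]], Iterator[Row]]


@dataclass(frozen=True)
class Fields:
    """Names that describe the positions inside a tuple."""
    names: Tuple[str, ...]

    def pos(self, name: str) -> int:
        return self.names.index(name)


@dataclass
class MemoryTap:
    """A source or sink for data; here just an in-memory list of text lines."""
    lines: List[str] = field(default_factory=list)


@dataclass
class DelimitedScheme:
    """The format of the data: delimited text with a fixed set of fields."""
    fields: Fields
    delimiter: str = ","

    def parse(self, line: str) -> Row:
        values = tuple(line.split(self.delimiter))
        if len(values) != len(self.fields.names):
            raise ValueError(f"expected {len(self.fields.names)} values: {line!r}")
        return values

    def render(self, row: Row) -> str:
        return self.delimiter.join(row)


class Pipe:
    """A chain of operations that a stream of tuples flows through."""

    def __init__(self) -> None:
        self.operations: List[Operation] = []

    def each(self, operation: Operation) -> "Pipe":
        self.operations.append(operation)
        return self

    def apply(self, stream: Iterable[Row]) -> Iterable[Row]:
        for operation in self.operations:
            stream = operation(stream)
        return stream


def run_flow(source: MemoryTap, scheme: DelimitedScheme, pipe: Pipe, sink: MemoryTap) -> None:
    """Execute the flow: read from the source tap, apply the pipe, write to the sink tap."""
    tuples = (scheme.parse(line) for line in source.lines)
    for row in pipe.apply(tuples):
        sink.lines.append(scheme.render(row))


FIELDS = Fields(("name", "age"))


def keep_adults(stream: Iterable[Row]) -> Iterator[Row]:
    """Filter operation: drop tuples whose 'age' field is below 18."""
    for row in stream:
        if int(row[FIELDS.pos("age")]) >= 18:
            yield row


def upper_name(stream: Iterable[Row]) -> Iterator[Row]:
    """Map operation: upper-case the 'name' field of every tuple."""
    i = FIELDS.pos("name")
    for row in stream:
        yield row[:i] + (row[i].upper(),) + row[i + 1:]


source = MemoryTap(["ann,34", "bob,12", "cy,18"])
sink = MemoryTap()
pipe = Pipe().each(keep_adults).each(upper_name)
run_flow(source, DelimitedScheme(FIELDS), pipe, sink)
print(sink.lines)  # ['ANN,34', 'CY,18']
```

In this sketch nothing is read or written until `run_flow` is called, which loosely mirrors the slide's point that Flows are what get scheduled and executed.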

  7. Not just Java!

  8. Scalding
    Scala DSL on top of Cascading
    Developed by Twitter
    https://github.com/twitter/scalding/

  9. Cascalog
    Clojure DSL on top of Cascading
    Inspired by Datalog
    https://github.com/nathanmarz/cascalog

  10. ANSI SQL via Lingual
    http://www.cascading.org/lingual/

  11. Lingual – design goals 1/3
    Immediate Data Access
    SQL access via Shell or JDBC driver
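
To make the idea of running SQL directly against delimited data concrete, here is a small stand-in that uses Python's built-in `sqlite3` module rather than Lingual itself. The table name `employees`, the sample rows, and the helper `query_delimited` are hypothetical; with Lingual the same kind of query would be issued through its shell or JDBC driver against data on Hadoop.

```python
import csv
import io
import sqlite3

# Hypothetical sample data standing in for a delimited file on a cluster.
EMPLOYEES_CSV = """name,dept,salary
ann,eng,120
bob,ops,90
cy,eng,100
"""


def query_delimited(text: str, sql: str) -> list:
    """Load delimited text into an in-memory table and run a SQL query on it."""
    reader = csv.reader(io.StringIO(text))
    next(reader)  # skip the header row; the column names are fixed below
    conn = sqlite3.connect(":memory:")
    try:
        conn.execute("CREATE TABLE employees (name TEXT, dept TEXT, salary INTEGER)")
        conn.executemany("INSERT INTO employees VALUES (?, ?, ?)", reader)
        return conn.execute(sql).fetchall()
    finally:
        conn.close()


result = query_delimited(
    EMPLOYEES_CSV,
    "SELECT dept, SUM(salary) FROM employees GROUP BY dept ORDER BY dept",
)
print(result)  # [('eng', 220), ('ops', 90)]
```

The query text is ordinary SQL; only the engine underneath differs, which is the point of the design goal above.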

  12. Lingual – design goals 2/3
    Simplify SQL Migration
    Move SQL workflows onto your Hadoop cluster
    via Cascading flows or the JDBC driver

  13. Lingual – design goals 3/3
    Simplify System & Data Integration
    Read from and write to HDFS, JDBC, memcached,
    HBase, Redshift...

  14. Lingual – ANSI SQL on Cascading
    http://www.cascading.org/lingual/

  15. Lingual and Cascading are about batch processing large
    amounts of data

  16. Demo

  17. Lingual – ANSI SQL on Cascading
    http://www.cascading.org/lingual/

  18. Q&A
    @fs111 / http://concurrentinc.com
    http://cascading.org/lingual

  19. Link collection
    http://www.cascading.org/lingual/
    http://www.cascading.org/
    http://docs.cascading.org/lingual/1.0/
    https://github.com/Cascading/
    http://concurrentinc.com
    https://groups.google.com/forum/#!forum/lingual-user
    https://groups.google.com/forum/#!forum/cascading-user
    http://docs.cascading.org/impatient/
    https://github.com/Cascading/vagrant-cascading-hadoop-cluster
