Slide 1

Slide 1 text

@fs111 #DV13 #lingual
A whirlwind tour through Lingual: ANSI SQL for Apache Hadoop
André Kelpe
concurrentinc.com

Slide 2

Slide 2 text

Speaker
André Kelpe
Software Engineer at Concurrent, the company behind Cascading and Lingual
concurrentinc.com / @concurrent
[email protected]
@fs111

Slide 3

Slide 3 text

Agenda
Cascading and Lingual
Lingual: design goals
Lingual: features
Demo: Lingual in action
Q&A

Slide 4

Slide 4 text

Cascading
http://cascading.org

Slide 5

Slide 5 text

Cascading terminology
Taps are sources and sinks for data
Schemes represent the format of the data
Pipes connect Taps

Slide 6

Slide 6 text

Cascading terminology
Tuples flow through Pipes
Fields describe the Tuples
Operations are executed on Tuples in TupleStreams
Flows get scheduled and executed
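To make the vocabulary on the two terminology slides concrete, here is a toy model in plain Python. It is only a sketch: Cascading itself is a Java library, and every class and function name below (DelimitedScheme, MemoryTap, Pipe, run_flow, keep_department, project) is hypothetical and invented for illustration, not part of the Cascading API. It shows a source Tap read through a Scheme, a Pipe applying two Operations to the Tuple stream, and a Flow writing the result to a sink Tap.

```python
from dataclasses import dataclass
from typing import Callable, Iterable, Iterator, List, Tuple

# Toy model only: these classes mirror the vocabulary on the slides,
# not the real Cascading API (which is a Java library).

Fields = Tuple[str, ...]   # names describing each position in a tuple
Row = Tuple[object, ...]   # one tuple flowing through a pipe


@dataclass
class DelimitedScheme:
    """Scheme: how raw lines map to tuples (here: delimiter-separated text)."""
    fields: Fields
    delimiter: str = ","

    def parse(self, line: str) -> Row:
        return tuple(line.split(self.delimiter))

    def render(self, row: Row) -> str:
        return self.delimiter.join(str(v) for v in row)


@dataclass
class MemoryTap:
    """Tap: a source or sink for data (here: an in-memory list of lines)."""
    scheme: DelimitedScheme
    lines: List[str]

    def read(self) -> Iterator[Row]:
        for line in self.lines:
            yield self.scheme.parse(line)

    def write(self, rows: Iterable[Row]) -> None:
        self.lines.extend(self.scheme.render(r) for r in rows)


@dataclass
class Pipe:
    """Pipe: carries the tuple stream and applies operations in order."""
    operations: List[Callable[[Iterator[Row]], Iterator[Row]]]

    def run(self, stream: Iterator[Row]) -> Iterator[Row]:
        for op in self.operations:
            stream = op(stream)
        return stream


def run_flow(source: MemoryTap, pipe: Pipe, sink: MemoryTap) -> None:
    """Flow: source tap -> pipe -> sink tap, executed as one unit."""
    sink.write(pipe.run(source.read()))


def keep_department(name: str, fields: Fields):
    """Operation: keep only tuples whose 'dept' field equals `name`."""
    idx = fields.index("dept")
    return lambda rows: (r for r in rows if r[idx] == name)


def project(wanted: Fields, fields: Fields):
    """Operation: keep only the named fields of each tuple."""
    idxs = [fields.index(f) for f in wanted]
    return lambda rows: (tuple(r[i] for i in idxs) for r in rows)


in_fields = ("name", "dept")
source = MemoryTap(DelimitedScheme(in_fields), ["ann,eng", "bob,ops", "cy,eng"])
sink = MemoryTap(DelimitedScheme(("name",), delimiter="\t"), [])
pipe = Pipe([keep_department("eng", in_fields), project(("name",), in_fields)])
run_flow(source, pipe, sink)
print(sink.lines)
```

The filter-then-project pipe corresponds roughly to what a statement like SELECT name FROM employees WHERE dept = 'eng' asks for, which is the kind of query Lingual accepts as SQL; how Lingual actually plans such a query onto Cascading is not shown here.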

Slide 7

Slide 7 text

Not just Java!

Slide 8

Slide 8 text

Scalding
Scala DSL on top of Cascading
Developed by Twitter
https://github.com/twitter/scalding/

Slide 9

Slide 9 text

Cascalog
Clojure DSL on top of Cascading
Inspired by Datalog
https://github.com/nathanmarz/cascalog

Slide 10

Slide 10 text

ANSI SQL via Lingual
http://www.cascading.org/lingual/

Slide 11

Slide 11 text

Lingual – design goals 1/3
Immediate Data Access
SQL access via Shell or JDBC driver

Slide 12

Slide 12 text

Lingual – design goals 2/3
Simplify SQL Migration
Move SQL workflows onto your Hadoop cluster via Cascading flows or the JDBC driver

Slide 13

Slide 13 text

Lingual – design goals 3/3
Simplify System & Data Integration
Read from and write to HDFS, JDBC, memcached, HBase, Redshift...

Slide 14

Slide 14 text

Lingual – ANSI SQL on Cascading
http://www.cascading.org/lingual/

Slide 15

Slide 15 text

Lingual and Cascading are about batch processing large amounts of data

Slide 16

Slide 16 text

Demo

Slide 17

Slide 17 text

Lingual – ANSI SQL on Cascading
http://www.cascading.org/lingual/

Slide 18

Slide 18 text

Q&A
@fs111 / http://concurrentinc.com
http://cascading.org/lingual

Slide 19

Slide 19 text

Link collection
http://www.cascading.org/lingual/
http://www.cascading.org/
http://docs.cascading.org/lingual/1.0/
https://github.com/Cascading/
http://concurrentinc.com
https://groups.google.com/forum/#!forum/lingual-user
https://groups.google.com/forum/#!forum/cascading-user
http://docs.cascading.org/impatient/
https://github.com/Cascading/vagrant-cascading-hadoop-cluster