EuroClojure 2012: Introduction to Cascalog
by
Stefan Hübner
Slide 1
Slide 1 text
Introduction to Cascalog Stefan Hübner, Nokia Berlin EuroClojure 2012
Slide 2
Slide 2 text
Nokia Maps • Address & POI Search • Category Search • Nearby Recommendations
Slide 3
Slide 3 text
Hadoop • Batch Processing • (Very) Large Scale • Distributed Filesystem • Parallel Computation • Fault-Tolerant
Slide 6
Slide 6 text
Hadoop MapReduce API • Tedious and verbose • Hard to test • Hard to refactor
Slide 8
Slide 8 text
Pig and Hive • Define their own query language • Custom operations in Java, Python, ... • Non-intuitive integration
Slide 9
Slide 9 text
(())
Slide 10
Slide 10 text
Star Trek I - "The Motion Picture", Paramount Pictures V'Ger's question
Slide 11
Slide 11 text
Star Trek I - "The Motion Picture", Paramount Pictures "Is that really all? Is there nothing more?"
Slide 12
Slide 12 text
Cascalog
Abstraction
• Cascalog: Variables and logic
• Cascading: Tuples, data workflows
• Hadoop: Key/value pairs, simple aggregation
slide (c) Nathan Marz, reproduced with permission
Slide 13
Slide 13 text
Queries
(<-                     ; defines a query
  [?person]             ; output variables
  (age ?person ?age)    ; generator with two variables
  (< ?age 30))          ; filter
Slide 14
Slide 14 text
Queries
(<- [?person]
  (age ?person ?age)    ; generator with two variables
  (< ?age 30))          ; filter
Predicates
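The query on the slide only defines a query; it does not run it. A minimal sketch of executing it follows, assuming Cascalog 1.x and Hadoop (local mode) are on the classpath. The namespace `example.queries`, the in-memory `age` dataset, and the function name `young-people` are hypothetical and not taken from the talk.

```clojure
(ns example.queries
  (:use cascalog.api))

;; Hypothetical in-memory dataset of [person age] tuples.
;; A vector of tuples can serve as a generator.
(def age
  [["alice" 28]
   ["bob" 33]
   ["chris" 40]
   ["david" 25]])

;; ??<- defines the query, runs it, and returns the result tuples
;; as a Clojure sequence (ordering is not guaranteed).
(defn young-people []
  (??<- [?person]
        (age ?person ?age)   ; generator: binds ?person and ?age
        (< ?age 30)))        ; filter: keeps tuples with ?age below 30
```

Calling `(young-people)` returns one single-field tuple per matching person. To print results instead of returning them, `?<-` with the `(stdout)` tap can be used in the same way.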
Slide 15
Slide 15 text
Predicates • Functions • Filters • Aggregators • Generators
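The four kinds of predicates can appear together in one query. The following is a rough sketch, again assuming Cascalog 1.x with Hadoop in local mode; the namespace, the `age` dataset, and the function name `adults-per-decade` are made up for illustration.

```clojure
(ns example.predicates
  (:use cascalog.api)
  (:require [cascalog.ops :as c]))

;; Hypothetical in-memory dataset of [person age] tuples.
(def age
  [["alice" 28]
   ["bob" 33]
   ["chris" 40]
   ["david" 25]
   ["emily" 15]])

;; Counts adults per decade of age.
(defn adults-per-decade []
  (??<- [?decade ?count]
        (age _ ?age)                ; generator: source of tuples (_ ignores the name)
        (>= ?age 18)                ; filter: drops tuples that fail the test
        (quot ?age 10 :> ?decade)   ; function: binds a new variable via :>
        (c/count ?count)))          ; aggregator: counts tuples per ?decade group
```

The aggregator is applied per group, where the grouping is given by the non-aggregated output variables (here `?decade`).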
Slide 17
Slide 17 text
Thank you! Stefan Hübner, @sthuebner http://knowyourmeme.com/memes/cereal-guy