Berlin Buzzwords 2012: Introduction to Cascalog
by Stefan Hübner
Slide 1
Slide 1 text
Introduction to Cascalog • Stefan Hübner, Nokia Berlin • Berlin Buzzwords 2012
Slide 2
Slide 2 text
Nokia Maps • Address & POI Search • Category Search • Nearby Recommendations
Slide 3
Slide 3 text
Hadoop • Batch Processing • (Very) Large Scale • Distributed Filesystem • Parallel Computation • Fault-Tolerant
Slide 5
Slide 5 text
Hadoop MapReduce API • Tedious and verbose • Hard to test • Hard to refactor
Slide 7
Slide 7 text
Pig and Hive • Define their own query language • Custom operations in Java, Python, ... • Non-intuitive integration
Slide 8
Slide 8 text
(())
Slide 9
Slide 9 text
Cascalog • Levels of abstraction, highest first: • Cascalog: Variables and logic • Cascading: Tuples, data workflows • Hadoop: Key/value pairs, simple aggregation • slide (c) Nathan Marz, reproduced with permission
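To make the lowest layer concrete, here is a minimal sketch of "key/value pairs, simple aggregation" as a word count in plain Python. It is not part of the talk and uses neither Hadoop nor Cascalog; the function names (`map_phase`, `shuffle`, `reduce_phase`, `word_count`) are hypothetical and only mimic the map, shuffle, and reduce steps in memory.

```python
from collections import defaultdict


def map_phase(lines):
    """Emit one (word, 1) key/value pair per word, as a mapper would."""
    for line in lines:
        for word in line.lower().split():
            yield word, 1


def shuffle(pairs):
    """Group values by key, standing in for the framework's shuffle step."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(groups):
    """Aggregate the values of each key with a simple sum."""
    return {key: sum(values) for key, values in groups.items()}


def word_count(lines):
    """Chain the three steps; illustrative only, everything runs in memory."""
    return reduce_phase(shuffle(map_phase(lines)))


if __name__ == "__main__":
    print(word_count(["hello hadoop", "hello cascalog"]))
```

Even this toy version needs three hand-written steps for a single aggregation, which is the kind of plumbing the higher layers on the slide are meant to hide.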
Slide 10
Slide 10 text
Thank you! Stefan Hübner, @sthuebner http://knowyourmeme.com/memes/cereal-guy