Upgrade to Pro
— share decks privately, control downloads, hide ads and more …
Speaker Deck
Features
Speaker Deck
PRO
Sign in
Sign up for free
Search
Search
Cascalog
Search
Sponsored
·
SiteGround - Reliable hosting with speed, security, and support you can count on.
→
αλεx π
May 22, 2013
Technology
160
4
Share
Cascalog
Short demo talk on Cascalog on Hadoop UG in Munich
αλεx π
May 22, 2013
More Decks by αλεx π
See All by αλεx π
Scalable Time Series With Cassandra
ifesdjeen
1
420
Bayesian Inference is known to make machines biased
ifesdjeen
2
400
Cassandra for Data Analytics Backends
ifesdjeen
7
460
Stream Processing and Functional Programming
ifesdjeen
1
780
PolyConf 2015 - Rocking the Time Series boat with C, Haskell and ClojureScript
ifesdjeen
0
520
Clojure - A Sweetspot for Analytics
ifesdjeen
8
2.1k
Going Off Heap
ifesdjeen
3
1.9k
Always be learning
ifesdjeen
1
190
Learn Yourself Emacs For Great Good workshop slides
ifesdjeen
3
350
Other Decks in Technology
See All in Technology
Databricks における 生成AIガバナンスの実践
taka_aki
1
300
AI Testing Talks: Challenges of Applying AI in Software Testing: From Hype to Practical Use
exactpro
PRO
1
120
大学生が本気でDatabricksを活用してDiscordサークルをデータ駆動させてみた
phantomjuju
1
400
Diagnosing performance problems without the guesswork
elenatanasoiu
0
160
生成 AI × MCP で切り拓く次世代 SRE!自律型運用への挑戦と開発者体験の進化
_awache
0
140
プラットフォームエンジニア ワークショップ/ platform-workshop
databricksjapan
1
260
React、まだ楽しくて草
uhyo
7
4k
Terraformモジュールは、なぜ「魔境」化するのか
hayama17
1
180
ポスター発表&デモと総括 / Poster Presentations & Demonstrations and Summary
ks91
PRO
0
190
ClearMLを活用した実験管理
sansantech
PRO
0
100
BigQuery の Cross-cloud Lakehouse への歩み
phaya72
2
550
そのPoC、何を検証したつもりでしたか? AIプロダクトの価値検証で陥った落とし穴
techtekt
PRO
0
140
Featured
See All Featured
JAMstack: Web Apps at Ludicrous Speed - All Things Open 2022
reverentgeek
1
460
So, you think you're a good person
axbom
PRO
2
2k
Data-driven link building: lessons from a $708K investment (BrightonSEO talk)
szymonslowik
1
1.1k
The Spectacular Lies of Maps
axbom
PRO
1
790
How To Speak Unicorn (iThemes Webinar)
marktimemedia
1
480
Abbi's Birthday
coloredviolet
2
7.9k
VelocityConf: Rendering Performance Case Studies
addyosmani
333
25k
Digital Projects Gone Horribly Wrong (And the UX Pros Who Still Save the Day) - Dean Schuster
uxyall
0
1.6k
Imperfection Machines: The Place of Print at Facebook
scottboms
270
14k
Paper Plane (Part 1)
katiecoart
PRO
0
8.5k
Evolution of real-time – Irina Nazarova, EuRuKo, 2024
irinanazarova
9
1.4k
How to audit for AI Accessibility on your Front & Back End
davetheseo
0
400
Transcript
Cascalog Hassle-free MapReduce that matches your scale Thursday, May 23,
13
Thursday, May 23, 13
Setting expecations •This is not a guide •And not a
tutorial •Doesn’t claim to be complete •Mostly to give you an idea •And encourage you to explore further Thursday, May 23, 13
How much time do you spend on writing logic that
framework should take care of? Thursday, May 23, 13
How easy is it to debug your map/reduce aggragation? Thursday,
May 23, 13
Hadoop + Java composable, but too vebrose Pig, Hive too
concrete, lack of abstraction and composition Thursday, May 23, 13
Thursday, May 23, 13
• Clear, declarative syntax • Inner and outer joins •
Aggregators • Functions • Subqueries, composition • Sorting • Performant Thursday, May 23, 13
Casca-WHAT? • Built on top of Hadoop (MapReduce) • Cascading
(tuples, workflows, job execution) • Written in Clojure • Datalog (logic programming) Thursday, May 23, 13
Abstract evrthn! Thursday, May 23, 13
Source where data pours from Thursday, May 23, 13
Pipe that data flows through Thursday, May 23, 13
Filter that makes sure that only good stuff goes through
Thursday, May 23, 13
Tuple they actually flow Thursday, May 23, 13
Thursday, May 23, 13
Query anatomy Thursday, May 23, 13
(?<- (stdout) [?person ?person-age] (age ?person ?person-age) (< ?person-age 30))
Thursday, May 23, 13
(?<- (stdout) [?person ?person-age] (age ?person ?person-age) (< ?person-age 30))
Output Thursday, May 23, 13
(?<- (stdout) [?person ?person-age] (age ?person ?person-age) (< ?person-age 30))
output vars Thursday, May 23, 13
(?<- (stdout) [?person ?person-age] (age ?person ?person-age) (< ?person-age 30))
Input input vars Thursday, May 23, 13
(?<- (stdout) [?person ?person-age] (age ?person ?person-age) (< ?person-age 30))
Logic/aggregations Thursday, May 23, 13
Sources and sinks • HDFS (go figure) • Cassandra •
MongoDB • SQL data sources • File system • Memory sources Thursday, May 23, 13
(?<- (stdout) [?person] (age ?person 25)) Exact match of second
element in a tuple Thursday, May 23, 13
(defn younger-than? [limit age] (< age limit)) (?<- (stdout) [?person
?age] (age ?person ?age) (younger-than? 32 ?age)) Predicate match, fn call Predicate Thursday, May 23, 13
(?<- (stdout) [?person ?count] (follows ?person _) (c/count ?count)) Aggregation
Thursday, May 23, 13
SHOWTIME! Thursday, May 23, 13
Benefits •Query language is same as application language •Subqueries, reusability
•Ad-hoc querying •Cascading underneath, so taps for all DBs work •Reuse application logic •Text editor integration Thursday, May 23, 13
@ifesdjeen (twitter/github) Thursday, May 23, 13