Upgrade to Pro
— share decks privately, control downloads, hide ads and more …
Speaker Deck
Features
Speaker Deck
PRO
Sign in
Sign up for free
Search
Search
Cascalog
Search
αλεx π
May 22, 2013
Technology
4
140
Cascalog
Short demo talk on Cascalog on Hadoop UG in Munich
αλεx π
May 22, 2013
Tweet
Share
More Decks by αλεx π
See All by αλεx π
Scalable Time Series With Cassandra
ifesdjeen
1
370
Bayesian Inference is known to make machines biased
ifesdjeen
2
370
Cassandra for Data Analytics Backends
ifesdjeen
7
430
Stream Processing and Functional Programming
ifesdjeen
1
740
PolyConf 2015 - Rocking the Time Series boat with C, Haskell and ClojureScript
ifesdjeen
0
460
Clojure - A Sweetspot for Analytics
ifesdjeen
8
2.1k
Going Off Heap
ifesdjeen
3
1.9k
Always be learning
ifesdjeen
1
140
Learn Yourself Emacs For Great Good workshop slides
ifesdjeen
3
330
Other Decks in Technology
See All in Technology
7月のガバクラ利用料が高かったので調べてみた
techniczna
3
810
【 LLMエンジニアがヒューマノイド開発に挑んでみた 】 - 第104回 Machine Learning 15minutes! Hybrid
soneo1127
0
240
AI時代にPdMとPMMはどう連携すべきか / PdM–PMM-collaboration-in-AI-era
rakus_dev
0
240
コスト削減の基本の「キ」~ コスト消費3大リソースへの対策 ~
smt7174
2
320
Snowflakeの生成AI機能を活用したデータ分析アプリの作成 〜Cortex AnalystとCortex Searchの活用とStreamlitアプリでの利用〜
nayuts
0
140
TypeScript入門
recruitengineers
PRO
35
11k
ここ一年のCCoEとしてのAWSコスト最適化を振り返る / CCoE AWS Cost Optimization devio2025
masahirokawahara
1
1.1k
RSCの時代にReactとフレームワークの境界を探る
uhyo
7
990
つくって納得、つかって実感! 大規模言語モデルことはじめ
recruitengineers
PRO
32
12k
DDD集約とサービスコンテキスト境界との関係性
pandayumi
2
200
MCPで変わる Amebaデザインシステム「Spindle」の開発
spindle
PRO
2
1.9k
DuckDB-Wasmを使って ブラウザ上でRDBMSを動かす
hacusk
1
140
Featured
See All Featured
VelocityConf: Rendering Performance Case Studies
addyosmani
332
24k
Done Done
chrislema
185
16k
Measuring & Analyzing Core Web Vitals
bluesmoon
9
570
Git: the NoSQL Database
bkeepers
PRO
431
66k
Java REST API Framework Comparison - PWX 2021
mraible
33
8.8k
RailsConf 2023
tenderlove
30
1.2k
The Psychology of Web Performance [Beyond Tellerrand 2023]
tammyeverts
49
3k
The Illustrated Children's Guide to Kubernetes
chrisshort
48
50k
CoffeeScript is Beautiful & I Never Want to Write Plain JavaScript Again
sstephenson
161
15k
Producing Creativity
orderedlist
PRO
347
40k
Gamification - CAS2011
davidbonilla
81
5.4k
Save Time (by Creating Custom Rails Generators)
garrettdimon
PRO
32
1.5k
Transcript
Cascalog Hassle-free MapReduce that matches your scale Thursday, May 23,
13
Thursday, May 23, 13
Setting expecations •This is not a guide •And not a
tutorial •Doesn’t claim to be complete •Mostly to give you an idea •And encourage you to explore further Thursday, May 23, 13
How much time do you spend on writing logic that
framework should take care of? Thursday, May 23, 13
How easy is it to debug your map/reduce aggragation? Thursday,
May 23, 13
Hadoop + Java composable, but too vebrose Pig, Hive too
concrete, lack of abstraction and composition Thursday, May 23, 13
Thursday, May 23, 13
• Clear, declarative syntax • Inner and outer joins •
Aggregators • Functions • Subqueries, composition • Sorting • Performant Thursday, May 23, 13
Casca-WHAT? • Built on top of Hadoop (MapReduce) • Cascading
(tuples, workflows, job execution) • Written in Clojure • Datalog (logic programming) Thursday, May 23, 13
Abstract evrthn! Thursday, May 23, 13
Source where data pours from Thursday, May 23, 13
Pipe that data flows through Thursday, May 23, 13
Filter that makes sure that only good stuff goes through
Thursday, May 23, 13
Tuple they actually flow Thursday, May 23, 13
Thursday, May 23, 13
Query anatomy Thursday, May 23, 13
(?<- (stdout) [?person ?person-age] (age ?person ?person-age) (< ?person-age 30))
Thursday, May 23, 13
(?<- (stdout) [?person ?person-age] (age ?person ?person-age) (< ?person-age 30))
Output Thursday, May 23, 13
(?<- (stdout) [?person ?person-age] (age ?person ?person-age) (< ?person-age 30))
output vars Thursday, May 23, 13
(?<- (stdout) [?person ?person-age] (age ?person ?person-age) (< ?person-age 30))
Input input vars Thursday, May 23, 13
(?<- (stdout) [?person ?person-age] (age ?person ?person-age) (< ?person-age 30))
Logic/aggregations Thursday, May 23, 13
Sources and sinks • HDFS (go figure) • Cassandra •
MongoDB • SQL data sources • File system • Memory sources Thursday, May 23, 13
(?<- (stdout) [?person] (age ?person 25)) Exact match of second
element in a tuple Thursday, May 23, 13
(defn younger-than? [limit age] (< age limit)) (?<- (stdout) [?person
?age] (age ?person ?age) (younger-than? 32 ?age)) Predicate match, fn call Predicate Thursday, May 23, 13
(?<- (stdout) [?person ?count] (follows ?person _) (c/count ?count)) Aggregation
Thursday, May 23, 13
SHOWTIME! Thursday, May 23, 13
Benefits •Query language is same as application language •Subqueries, reusability
•Ad-hoc querying •Cascading underneath, so taps for all DBs work •Reuse application logic •Text editor integration Thursday, May 23, 13
@ifesdjeen (twitter/github) Thursday, May 23, 13