Lock in $30 Savings on PRO—Offer Ends Soon! ⏳
Speaker Deck
Features
Speaker Deck
PRO
Sign in
Sign up for free
Search
Search
AppsFlyer presenting: Cascalog, MapReduce for T...
Search
AppsFlyer
November 06, 2014
Programming
0
960
AppsFlyer presenting: Cascalog, MapReduce for The Code Craftsman
This presentation describe AppsFlyer's work with Hadoop in the Clojure production environment.
AppsFlyer
November 06, 2014
Tweet
Share
More Decks by AppsFlyer
See All by AppsFlyer
Processing 15 Billion events a day without breaking the bank - ReversimX ILTechTalks
appsflyer
0
500
Journey to the Real-Time Analytics in Extreme Growth
appsflyer
0
310
10 Real problems & solutions in your build and deploy process
appsflyer
0
150
DevOps paradigm in R&D day-to-day
appsflyer
0
160
Building a Mobile Backend to Evolve
appsflyer
0
110
Ido Barkan
appsflyer
1
150
Sometimes, Druid is not the best solution for a business use case
appsflyer
1
440
Processing 8 Billion Daily Events in Real Time!
appsflyer
1
130
React Performance
appsflyer
1
230
Other Decks in Programming
See All in Programming
開発に寄りそう自動テストの実現
goyoki
2
1.4k
LLM Çağında Backend Olmak: 10 Milyon Prompt'u Milisaniyede Sorgulamak
selcukusta
0
130
Giselleで作るAI QAアシスタント 〜 Pull Requestレビューに継続的QAを
codenote
0
300
生成AIを利用するだけでなく、投資できる組織へ
pospome
2
410
HTTPプロトコル正しく理解していますか? 〜かわいい猫と共に学ぼう。ฅ^•ω•^ฅ ニャ〜
hekuchan
2
420
フルサイクルエンジニアリングをAI Agentで全自動化したい 〜構想と現在地〜
kamina_zzz
0
290
ELYZA_Findy AI Engineering Summit登壇資料_AIコーディング時代に「ちゃんと」やること_toB LLMプロダクト開発舞台裏_20251216
elyza
2
620
生成AI時代を勝ち抜くエンジニア組織マネジメント
coconala_engineer
0
8.4k
【卒業研究】会話ログ分析によるユーザーごとの関心に応じた話題提案手法
momok47
0
120
perlをWebAssembly上で動かすと何が嬉しいの??? / Where does Perl-on-Wasm actually make sense?
mackee
0
130
これならできる!個人開発のすゝめ
tinykitten
PRO
0
130
Flutter On-device AI로 완성하는 오프라인 앱, 박제창 @DevFest INCHEON 2025
itsmedreamwalker
1
150
Featured
See All Featured
Side Projects
sachag
455
43k
AI Search: Implications for SEO and How to Move Forward - #ShenzhenSEOConference
aleyda
1
1k
The B2B funnel & how to create a winning content strategy
katarinadahlin
PRO
0
190
CSS Pre-Processors: Stylus, Less & Sass
bermonpainter
359
30k
GitHub's CSS Performance
jonrohan
1032
470k
Keith and Marios Guide to Fast Websites
keithpitt
413
23k
Chrome DevTools: State of the Union 2024 - Debugging React & Beyond
addyosmani
9
1k
Agile that works and the tools we love
rasmusluckow
331
21k
B2B Lead Gen: Tactics, Traps & Triumph
marketingsoph
0
32
I Don’t Have Time: Getting Over the Fear to Launch Your Podcast
jcasabona
34
2.6k
Why Mistakes Are the Best Teachers: Turning Failure into a Pathway for Growth
auna
0
28
A Modern Web Designer's Workflow
chriscoyier
698
190k
Transcript
Cascalog
AppsFlyer We use Clojure Used python previously Mobile app conversion
tracking
Challenges User Retention Cohort analysis ~0.5TB of data collected daily
What is Cascalog? A Clojure DSL for writing Hadoop jobs
On top of Cascading
Clojure Modern LISP for the JVM
Why Cascalog? We already know and love Clojure Same tools
- test in the REPL Custom operations are ordinary functions (no UDFs)
Why Cascalog? Cascalog is Datalog Fits our use cases well
Simple (once you know it )
Relations and Tuples Relation 1 Relation 2 t 1 t
2 t 3 t 4 t 1 t 2 t 3 t 4 Relational Model
Query Anatomy
Query Anatomy Order does not matter Grouping is implicit
Why Cascalog? Composition
Why Cascalog? Composition Joins are implicit
Generators Cascalog taps Hadoop and local file systems Clojure sequences
[["alice" 28] ["bob" 33] ["chris" 40] ["david" 25] ["emily" 25] ["george" 31]] Cascalog queries Defined using <-
Demo
The AppsFlyer flow Kafka Secor AWS S3 Data Collection Continuously
saves Kafka topics to HDFS/S3 according to a scheme
The AppsFlyer flow Data Processing Lemur Spins up Hadoop cluster
Submit steps Data processing (using cascalog) Export processed data (using Apache Sqoop) Postgresql 1 2 1 2
Questions ?