AppsFlyer presenting: Cascalog, MapReduce for The Code Craftsman
This presentation describes AppsFlyer's work with Hadoop in a Clojure production environment.
AppsFlyer
November 06, 2014
Transcript
Cascalog
AppsFlyer: We use Clojure (previously Python). Mobile app conversion tracking.
Challenges: user retention; cohort analysis; ~0.5 TB of data collected daily.
What is Cascalog? A Clojure DSL for writing Hadoop jobs, built on top of Cascading.
Clojure: a modern Lisp for the JVM.
Why Cascalog? We already know and love Clojure. Same tools: test in the REPL. Custom operations are ordinary functions (no UDFs).
Why Cascalog? Cascalog is Datalog. It fits our use cases well. Simple (once you know it).
Relations and Tuples [diagram: the relational model, showing Relation 1 and Relation 2, each holding tuples t1 to t4]
Query Anatomy: the order of predicates does not matter; grouping is implicit.
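The two properties on this slide can be modelled in plain Python. This is only a sketch of the semantics, not Cascalog code: the data is borrowed from the Generators slide, and the helper names (`count_by_age`, `apply_filters`) and the two filter conditions are hypothetical.

```python
from collections import Counter
from itertools import permutations

people = [["alice", 28], ["bob", 33], ["chris", 40],
          ["david", 25], ["emily", 25], ["george", 31]]

def count_by_age(rows):
    """Model of implicit grouping: the non-aggregated output field (age)
    becomes the grouping key for the aggregate (count)."""
    return dict(Counter(age for _name, age in rows))

def apply_filters(rows, filters):
    """Apply each filter predicate in the order given."""
    for keep in filters:
        rows = [row for row in rows if keep(*row)]
    return rows

# Two hypothetical filter predicates; every ordering yields the same rows.
filters = [lambda name, age: age < 35,
           lambda name, age: name != "bob"]
results = [apply_filters(people, order) for order in permutations(filters)]

age_counts = count_by_age(people)
print(age_counts)
print(results[0] == results[1])
```

In the model, nobody asks for a "group by age": it follows from age being an output field that is not aggregated, which is the sense in which grouping is implicit.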
Why Cascalog? Composition. Joins are implicit.
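As a rough illustration of what an implicit join means, the plain-Python sketch below joins two relations on the field they share, then feeds the joined result into a second query. It models the idea only and is not Cascalog code. The `cities` relation and the function name `join_on_name` are hypothetical; the `people` data comes from the Generators slide.

```python
people = [["alice", 28], ["bob", 33], ["chris", 40],
          ["david", 25], ["emily", 25], ["george", 31]]

# Hypothetical second relation, invented for this example.
cities = [["alice", "Tel Aviv"], ["bob", "Haifa"], ["emily", "Tel Aviv"]]

def join_on_name(ages, cities):
    """Inner join on the shared `name` field, the way a logic variable
    used in two relations ties them together."""
    city_of = {}
    for name, city in cities:
        city_of.setdefault(name, []).append(city)
    return [(name, age, city)
            for name, age in ages
            for city in city_of.get(name, [])]

joined = join_on_name(people, cities)

# Composition: the result of one query is the input of the next.
tel_aviv = [(name, age) for name, age, city in joined if city == "Tel Aviv"]

print(joined)
print(tel_aviv)
```

People without a matching city row drop out, as in an inner join.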
Generators: Cascalog taps (Hadoop and local file systems); Clojure sequences, e.g. [["alice" 28] ["bob" 33] ["chris" 40] ["david" 25] ["emily" 25] ["george" 31]]; Cascalog queries, defined using <-.
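The sketch below models, in plain Python, the idea of an in-memory sequence acting as a generator for a query that is defined first and run later. It is not Cascalog code. The helper `define_query` and the "younger than 30" condition are hypothetical; only the data comes from the slide.

```python
people = [["alice", 28], ["bob", 33], ["chris", 40],
          ["david", 25], ["emily", 25], ["george", 31]]

def define_query(source, predicate, output):
    """Build a query without running it.
    Nothing is read from `source` until the returned function is called."""
    def run():
        return [output(*row) for row in source if predicate(*row)]
    return run

# Define: names of people younger than 30 (hypothetical condition).
young_people = define_query(
    people,
    predicate=lambda name, age: age < 30,
    output=lambda name, age: name,
)

# Execute: only now is the sequence scanned.
young = young_people()
print(young)
```

Keeping definition and execution separate is what lets a query be passed around and reused as the source of another query.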
Demo
The AppsFlyer flow, data collection: Kafka -> Secor -> AWS S3. Secor continuously saves Kafka topics to HDFS/S3 according to a scheme.
The AppsFlyer flow, data processing: Lemur (1) spins up a Hadoop cluster and (2) submits steps. The steps are (1) data processing (using Cascalog) and (2) exporting the processed data to PostgreSQL (using Apache Sqoop).
Questions?