AppsFlyer Data Architecture
AppsFlyer
May 21, 2015
Transcript
Distilling insights @
Arnon Rotem-Gal-Oz, Chief Data Officer
[Architecture diagram] Inputs: installs, clicks, inapp, launches, Accounts. Components shown: Kafka; Secor; Raw (sequence files); Spark ETL; DW (parquet files); Spark Aggregations; DM (Aggregations); Columnar Database (Redshift, evaluating Vertica); IMDG (Ignite, evaluating Geode); SparkSQL (evaluating Drill, Presto); SQL; Spark ML; Latest Events; Scoring; exploration; Agg. logic; Application dashboard; Self-serve BI (TBD); Internal tools.
Data’s hierarchy of needs* *With apologies to Maslow
Acted upon, Presented, Distilled, Usable, Accessible, Exist
Exist
Working off of RAW data
“Malting” Just slap SQL on everything Accessible
Fermenting Usable
Distilling Distilled
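The "Distilling" stage lines up with the aggregation layer named in the diagram (Spark Aggregations, DM). The deck does not show the aggregation logic itself, so the following is only a minimal plain-Python sketch of the idea, not Spark code and not AppsFlyer's implementation. The field names `account_id`, `event_type` and `timestamp`, and the function `aggregate_daily`, are hypothetical.

```python
from collections import Counter
from datetime import datetime, timezone
from typing import Any, Dict, Iterable

Event = Dict[str, Any]  # hypothetical event shape, not taken from the deck


def aggregate_daily(events: Iterable[Event]) -> Counter:
    """Count events per (account_id, UTC day, event_type)."""
    counts: Counter = Counter()
    for e in events:
        # Bucket the epoch-seconds timestamp into a UTC calendar day.
        day = datetime.fromtimestamp(e["timestamp"], tz=timezone.utc).strftime("%Y-%m-%d")
        counts[(e["account_id"], day, e["event_type"])] += 1
    return counts


# 1432166400 is 2015-05-21 00:00:00 UTC.
events = [
    {"account_id": "acct-A", "event_type": "install", "timestamp": 1432166400},
    {"account_id": "acct-A", "event_type": "install", "timestamp": 1432166400 + 3600},
    {"account_id": "acct-A", "event_type": "click", "timestamp": 1432166400 + 86400},
]
daily = aggregate_daily(events)
```

A dashboard would then read small rows like these rather than scanning raw events. In a Spark job the same grouping would typically be expressed as a group-by over the warehouse tables.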
RT insights Predictive Prescriptive
Dashboards whatnot presented
Sidetrack: On use of Spark
Hadoop & Mesos
Land data in a queue
All data is time-series
Enrich with foreign keys before persisting
Analyze and balance jobs
Not everything is big data
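Of the lessons above, "enrich with foreign keys before persisting" is the one most easily shown in code. The sketch below is a minimal illustration under stated assumptions. The deck does not say which keys are attached or how the lookup is done, so the `app_id` to `account_id` mapping, the class names and the `enrich` function are all hypothetical.

```python
from dataclasses import dataclass
from typing import Dict, Iterable, Iterator, Optional


@dataclass(frozen=True)
class RawEvent:
    event_type: str  # e.g. "install", "click", "inapp", "launch"
    app_id: str      # hypothetical natural key carried by the event
    timestamp: int   # epoch seconds; every event carries its time


@dataclass(frozen=True)
class EnrichedEvent:
    event_type: str
    app_id: str
    timestamp: int
    account_id: Optional[str]  # foreign key resolved at ingest time


def enrich(events: Iterable[RawEvent],
           app_to_account: Dict[str, str]) -> Iterator[EnrichedEvent]:
    """Attach the account foreign key to each event before it is persisted."""
    for e in events:
        # Unknown apps get None instead of being dropped, so no data is lost.
        yield EnrichedEvent(e.event_type, e.app_id, e.timestamp,
                            app_to_account.get(e.app_id))


app_to_account = {"app-1": "acct-A", "app-2": "acct-B"}  # hypothetical lookup table
raw = [
    RawEvent("install", "app-1", 1432166400),
    RawEvent("click", "app-3", 1432166460),
]
enriched = list(enrich(raw, app_to_account))
```

Resolving the key once at write time means later queries over the stored events can filter or group by account without a join back to the accounts table.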
We’re hiring…. jobs@appsflyer.com