Streaming Ingestion & Processing at Flipkart
Siddhartha Reddy
May 15, 2015
Technology
Presented at the Bangalore Hadoop Meetup held on 15th May 2015.
Transcript
Streaming Ingestion & Processing at Flipkart Siddhartha Reddy @sids
Flipkart Data Platform (an oversimplified view)
Streaming Ingestion
Choices
• push, not pull
• schemas & validations
Streaming Ingestion v1.0
• Push ⇒ accountability (with source teams)
  • good call!
• Schemas ⇒ contracts for consumers
  • can make assumptions that are assured to be true
• Insufficient tooling ⇒ too many “ingestion frameworks”
  • adopt some frameworks & offer as tools!
• Synchronous error handling ⇒ complexity
  • accept all data
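One possible reading of the last lesson ("accept all data") is that the ingestion endpoint always acknowledges a push and moves schema validation off the caller's path, so source teams never have to handle validation errors synchronously. The sketch below illustrates that reading only; the `Ingestor` class, the `ORDER_SCHEMA` fields, and the quarantine list are hypothetical names, not taken from the talk.

```python
from dataclasses import dataclass, field
from typing import Any

# Hypothetical schema: field name -> required Python type.
ORDER_SCHEMA = {"order_id": str, "amount": float}


def validate(record: dict[str, Any], schema: dict[str, type]) -> list[str]:
    """Return a list of schema violations (empty if the record is valid)."""
    errors = []
    for name, expected in schema.items():
        if name not in record:
            errors.append(f"missing field: {name}")
        elif not isinstance(record[name], expected):
            errors.append(f"bad type for {name}: expected {expected.__name__}")
    return errors


@dataclass
class Ingestor:
    schema: dict[str, type]
    accepted: list[dict] = field(default_factory=list)
    quarantined: list[tuple[dict, list[str]]] = field(default_factory=list)

    def push(self, record: dict[str, Any]) -> bool:
        """Always acknowledge the push; a validation failure never fails the caller."""
        errors = validate(record, self.schema)
        if errors:
            # Invalid data is kept aside with its errors instead of being rejected.
            self.quarantined.append((record, errors))
        else:
            self.accepted.append(record)
        return True


ingestor = Ingestor(ORDER_SCHEMA)
ingestor.push({"order_id": "o-1", "amount": 499.0})
ingestor.push({"order_id": "o-2"})  # missing amount: acknowledged, then quarantined
```

Consumers would read only from `accepted`, which preserves the schema contract, while the quarantined records and their error lists can be reported back to the source team out of band.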
Streaming Ingestion v2.0
Stream Processing
An Example
Streaming Joins: Example
It works! But… how do we deal with lookup failures?
Streaming Joins: Handling Failures
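The slide content for this step did not survive in the transcript, but the summary names a lookup table, a retry queue, and a batch re-driver. A minimal sketch of how those pieces could fit together follows; it is an illustration under assumptions, not the implementation described in the talk. The function names, the `MAX_ATTEMPTS` limit, and the `dead` list (standing in for whatever is handed to a batch re-driver) are all hypothetical.

```python
from collections import deque

MAX_ATTEMPTS = 3  # hypothetical limit before giving up on in-stream retries


def join_stream(events, lookup, retry_queue, joined):
    """Join each event with its lookup row; park lookup misses on the retry queue."""
    for event in events:
        row = lookup.get(event["key"])
        if row is None:
            retry_queue.append((event, 1))  # one attempt made so far
        else:
            joined.append({**event, **row})


def drain_retries(lookup, retry_queue, joined, dead):
    """Make one pass over the retry queue, re-attempting each parked event once."""
    for _ in range(len(retry_queue)):
        event, attempts = retry_queue.popleft()
        row = lookup.get(event["key"])
        if row is not None:
            joined.append({**event, **row})
        elif attempts + 1 >= MAX_ATTEMPTS:
            dead.append(event)  # assumed hand-off point for a batch re-driver
        else:
            retry_queue.append((event, attempts + 1))


lookup = {"p-1": {"title": "Phone"}}
retry_queue, joined, dead = deque(), [], []
join_stream(
    [{"key": "p-1", "qty": 1}, {"key": "p-2", "qty": 2}],
    lookup, retry_queue, joined,
)
lookup["p-2"] = {"title": "Case"}  # the lookup row arrives late
drain_retries(lookup, retry_queue, joined, dead)
```

In this sketch a lookup miss never blocks the stream: the event waits on the queue until its lookup row shows up or the attempt limit is reached.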
Streaming Joins: Bootstrapping
With a little help from MR (MapReduce) friends
Streaming Joins: But…
The example that doesn’t really work correctly
Streaming Joins
In summary
• Streaming Ingestion: push, schemas & validation, HTTP service, local daemon, change data capture
• Streaming Joins: indexing, lookup tables, map-joins, retry queue, batch re-driver
sid@flipkart.com