$30 off During Our Annual Pro Sale. View Details »
Speaker Deck
Features
Speaker Deck
PRO
Sign in
Sign up for free
Search
Search
Streaming Ingestion & Processing at Flipkart
Search
Siddhartha Reddy
May 15, 2015
Technology
0
400
Streaming Ingestion & Processing at Flipkart
Presented at the Bangalore Hadoop Meetup held on 15th May 2015.
Siddhartha Reddy
May 15, 2015
Tweet
Share
More Decks by Siddhartha Reddy
See All by Siddhartha Reddy
Future Patterns in Data Ecosystem
sids
1
200
CAP Theorem: You don’t need CP, you don’t want AP, and you can’t have CA
sids
6
12k
Other Decks in Technology
See All in Technology
バグハンター視点によるサプライチェーンの脆弱性
scgajge12
2
510
Docker, Infraestructuras seguras y Hardening
josejuansanchez
0
150
Digitization部 紹介資料
sansan33
PRO
1
6.1k
Symfony AI in Action
el_stoffel
2
370
履歴テーブル、今回はこう作りました 〜 Delegated Types編 〜 / How We Built Our History Table This Time — With Delegated Types
moznion
16
9.5k
Databricksによるエージェント構築
taka_aki
1
120
プロダクトマネージャーが押さえておくべき、ソフトウェア資産とAIエージェント投資効果 / pmconf2025
i35_267
2
360
セキュリティAIエージェントの現在と未来 / PSS #2 Takumi Session
flatt_security
3
1.4k
Master Dataグループ紹介資料
sansan33
PRO
1
4k
その設計、 本当に価値を生んでますか?
shimomura
3
190
pmconf2025 - データを活用し「価値」へ繋げる
glorypulse
0
460
技術以外の世界に『越境』しエンジニアとして進化を遂げる 〜Kotlinへの愛とDevHRとしての挑戦を添えて〜
subroh0508
1
160
Featured
See All Featured
For a Future-Friendly Web
brad_frost
180
10k
KATA
mclloyd
PRO
32
15k
Building Applications with DynamoDB
mza
96
6.8k
Why You Should Never Use an ORM
jnunemaker
PRO
60
9.6k
GraphQLの誤解/rethinking-graphql
sonatard
73
11k
Save Time (by Creating Custom Rails Generators)
garrettdimon
PRO
32
1.8k
What’s in a name? Adding method to the madness
productmarketing
PRO
24
3.8k
Practical Tips for Bootstrapping Information Extraction Pipelines
honnibal
25
1.6k
The Power of CSS Pseudo Elements
geoffreycrofte
80
6.1k
Visualizing Your Data: Incorporating Mongo into Loggly Infrastructure
mongodb
48
9.8k
Learning to Love Humans: Emotional Interface Design
aarron
274
41k
How To Stay Up To Date on Web Technology
chriscoyier
791
250k
Transcript
Streaming Ingestion & Processing at Flipkart Siddhartha Reddy @sids
Flipkart Data Platform (an oversimplified view)
Streaming Ingestion
Choices • push, not pull • schemas & validations
Streaming Ingestion v1.0
None
• Push 㱺 accountability (with source teams) • good call!
• Schemas 㱺 contracts for consumers • can make assumptions that are assured to be true • Insufficient tooling 㱺 too many “ingestion frameworks” • adopt some frameworks & offer as tools! • Synchronous error handling 㱺 complexity • accept all data
Streaming Ingestion v2.0
Stream Processing
An Example
Streaming Joins: Example It works! But… how do we deal
with lookup failures?
Streaming Joins: Handling Failures
None
None
Streaming Joins: Bootstrapping With a little help from MR friends
Streaming Joins: But… The example that doesn’t really work correctly
Streaming Joins
In summary • Streaming Ingestion: push, schemas & validation, HTTP
service, local daemon, change data capture • Streaming Joins: indexing, lookup tables, map-joins, retry queue, batch re-driver sid@flipkart.com