Streaming Ingestion & Processing at Flipkart

Siddhartha Reddy

May 15, 2015

Presented at the Bangalore Hadoop Meetup held on 15th May 2015.

Transcript

Streaming Ingestion & Processing at Flipkart Siddhartha Reddy @sids
Flipkart Data Platform (an oversimplified view)
Streaming Ingestion
Choices
• push, not pull
• schemas & validations
Streaming Ingestion v1.0
• Push ⇒ accountability (with source teams)
  • good call!
• Schemas ⇒ contracts for consumers
  • can make assumptions that are assured to be true
• Insufficient tooling ⇒ too many “ingestion frameworks”
  • adopt some frameworks & offer as tools!
• Synchronous error handling ⇒ complexity
  • accept all data
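
The slides do not show how ingestion validates events, so the following is only a minimal sketch of one reading of the points above: events are pushed by the source, checked against a schema, and always accepted, with invalid events routed aside for asynchronous handling instead of being rejected synchronously. The names `ORDER_SCHEMA`, `validate`, and `Ingestor`, and the in-memory lists standing in for streams, are hypothetical and not taken from the talk.

```python
from dataclasses import dataclass, field

# Hypothetical schema: field name -> required Python type.
ORDER_SCHEMA = {"order_id": str, "amount": float}


def validate(event: dict, schema: dict) -> list:
    """Return a list of schema violations (empty if the event is valid)."""
    errors = []
    for name, expected in schema.items():
        if name not in event:
            errors.append(f"missing field: {name}")
        elif not isinstance(event[name], expected):
            errors.append(f"bad type for {name}: expected {expected.__name__}")
    return errors


@dataclass
class Ingestor:
    schema: dict
    accepted: list = field(default_factory=list)  # stand-in for the main stream
    rejected: list = field(default_factory=list)  # stand-in for an error stream

    def push(self, event: dict) -> bool:
        """Always acknowledge the push; route invalid events to the error stream."""
        errors = validate(event, self.schema)
        if errors:
            self.rejected.append({"event": event, "errors": errors})
        else:
            self.accepted.append(event)
        return True
```

In this sketch the producer never has to handle a validation error inline, while consumers of `accepted` can still rely on the schema as a contract.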
Streaming Ingestion v2.0
Stream Processing
An Example
Streaming Joins: Example
It works! But… how do we deal with lookup failures?
Streaming Joins: Handling Failures
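
The diagrams for this slide did not survive in the transcript, so what follows is a hedged sketch rather than the design presented. It assumes a join against an in-memory lookup table, a retry queue for events whose lookup failed, and a final list of events left for a batch re-driver once retries are exhausted. The class `StreamingJoiner`, its methods, and the `max_attempts` parameter are hypothetical names chosen for illustration.

```python
from collections import deque


class StreamingJoiner:
    def __init__(self, max_attempts=3):
        self.lookup = {}            # lookup table: key -> record to join in
        self.retry_queue = deque()  # (event, attempts so far) awaiting another lookup
        self.dead = []              # events left for a batch re-driver
        self.max_attempts = max_attempts

    def index(self, key, record):
        """Add or update an entry in the lookup table."""
        self.lookup[key] = record

    def join(self, event, attempts=0):
        """Join an event with its lookup record, or schedule a retry on a miss."""
        record = self.lookup.get(event["key"])
        if record is not None:
            return {**event, **record}
        if attempts + 1 < self.max_attempts:
            self.retry_queue.append((event, attempts + 1))
        else:
            self.dead.append(event)  # retries exhausted
        return None

    def drain_retries(self):
        """Retry each queued event once; return the ones that now join."""
        joined = []
        for _ in range(len(self.retry_queue)):
            event, attempts = self.retry_queue.popleft()
            result = self.join(event, attempts)
            if result is not None:
                joined.append(result)
        return joined
```

A retry helps when the lookup record simply arrives later than the event; events that never find a match end up in `dead`, where a batch job could re-drive them.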
Streaming Joins: Bootstrapping With a little help from MR friends
Streaming Joins: But… The example that doesn’t really work correctly
Streaming Joins
In summary
• Streaming Ingestion: push, schemas & validation, HTTP service, local daemon, change data capture
• Streaming Joins: indexing, lookup tables, map-joins, retry queue, batch re-driver
sid@flipkart.com