A programming model (and corresponding implementation) for processing and generating large data sets.
MapReduce
J. Dean and S. Ghemawat. MapReduce: Simplified Data Processing on Large Clusters. In Proceedings of the 6th Symposium on Operating Systems Design and Implementation (OSDI), pages 137–149, 2004.
Background
Time
—> 2004: the beginning of the multi-core era
Context
—> Industry-wide shift towards multi-core machines
—> Google infrastructure (cluster-based computing environment):
—> process large amounts of raw data (e.g. crawled docs, web request logs)
—> compute various kinds of derived data (e.g. inverted indices)
Issues
—> Ad-hoc (complex) solutions
—> Input data usually large, so every solution must parallelise the code, distribute the data, and handle failures
Key challenges of high-throughput computation
The idea
Move the complexity into the backend
—> (automatic) parallelisation of computation and distribution of data;
—> I/O scheduling and monitoring
Allow for simple solutions over “small” (user-defined) pieces of the input data
—> Decompose the problem into multiple “smaller” tasks (divide and conquer)
Each solution must comply with the computational paradigm required by the framework
—> Convention over Configuration principle
Programming Model
The computation is expressed in terms of input/output key/value pairs.
The user defines two functions:
map(k1, v1) -> list(k2, v2)
—> processes an input key/value pair
—> produces a set of intermediate pairs
reduce(k2, list(v2)) -> list(v2)
—> combines all intermediate values for a particular key
—> produces a set of merged output values (usually just one)
!! Convention: intermediate data types are strings by default !!
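
To make the model concrete, here is the paper's running word-count example rendered as a minimal Python sketch; the toy driver that routes pairs between the two functions is an assumption for illustration, standing in for the framework itself.

def map_fn(key, value):
    # key: document name; value: document contents.
    # Emit an intermediate pair (word, "1") for every word occurrence.
    for word in value.split():
        yield (word, "1")

def reduce_fn(key, values):
    # key: a word; values: all counts emitted for it (strings, per the convention above).
    yield str(sum(int(v) for v in values))

# Toy driver (illustrative only): group intermediate pairs by key, then reduce.
from collections import defaultdict

def run(docs):
    intermediate = defaultdict(list)
    for k1, v1 in docs.items():
        for k2, v2 in map_fn(k1, v1):
            intermediate[k2].append(v2)
    return {k2: list(reduce_fn(k2, v2s)) for k2, v2s in intermediate.items()}

print(run({"d1": "the quick fox", "d2": "the dog"}))  # {'the': ['2'], 'quick': ['1'], ...}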
Execution Overview
[Figure: execution overview diagram from the paper. Source: http://bit.ly/map-reduce-paper]
—> Input data is split into M partitions, one per map task
—> Intermediate keys are split into R partitions, one per reduce task, using the partitioning function hash(key) mod R (see the sketch after this list)
—> Typical scale: M = 200K map tasks, R = 5K reduce tasks, W = 2K workers; M >> R > W, and workers ≠ tasks
—> Map workers read input through a reader and may run a combiner locally; reduce workers fetch intermediate data via remote procedure calls
—> The master keeps a status map for every task and worker
—> The R output files often become the input of another MR job
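
The default partitioner described above is a one-liner; a Python sketch (Python's built-in hash stands in for the paper's string hash):

def partition(key, R):
    # Map an intermediate key to one of the R reduce partitions.
    # Note: Python randomises string hashes per process; a real
    # implementation would use a stable hash function.
    return hash(key) % R

The paper also lets users supply a custom partitioner, e.g. hash(Hostname(urlkey)) mod R, so that all URLs from the same host end up in the same output file.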
Fault Tolerance: Re-Execution
On a worker failure:
—> detect failures via periodic heartbeats (the master pings every worker)
—> re-execute completed and in-progress map tasks: completed map output sits on the failed worker's local disk, so it is lost
—> re-execute in-progress reduce tasks only: completed reduce output is already in the global file system
—> task completion is committed through the master (see the sketch after this list)
Master failure (very unlikely)
—> write checkpoints periodically & re-execute from the last checkpoint
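
These rules fit in a few lines of master-side bookkeeping. An illustrative Python sketch, assuming a worker-to-last-heartbeat table and a task list (the names and data structures are assumptions, not the paper's implementation):

import time

HEARTBEAT_TIMEOUT = 10.0  # seconds of silence before a worker is declared dead (assumed value)

def reap_failed_workers(last_heartbeat, tasks, now=None):
    now = time.time() if now is None else now
    dead = {w for w, t in last_heartbeat.items() if now - t >= HEARTBEAT_TIMEOUT}
    for task in tasks:
        if task["worker"] not in dead:
            continue
        if task["kind"] == "map":
            # Map output lives on the dead worker's local disk: redo even completed tasks.
            task["state"] = "idle"
        elif task["state"] == "in-progress":
            # Completed reduce output is safe in the global FS: only redo in-progress ones.
            task["state"] = "idle"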
(additional) Refinements
Locality (mapper)
—> (thousands of machines) schedule map tasks near their input so workers read at local-disk speed
—> (otherwise) rack switches limit the read rate
Skipping Bad Records
—> on a seg-fault, the worker sends a UDP packet to the master (with the record's sequence number)
—> if the master sees two failures for the same record => skip it on re-execution
Reducer:
—> custom combiner (a partial reduce run on the map worker) to save network bandwidth; see the sketch after this list
—> compression of intermediate data
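
For word count the combiner is just the reducer run early. A minimal Python sketch of local combining on the map worker (the function name is illustrative):

from collections import defaultdict

def combine(pairs):
    # Partially merge (word, "1") pairs before they leave the map worker,
    # so each word crosses the network once per mapper, not once per occurrence.
    merged = defaultdict(int)
    for word, count in pairs:
        merged[word] += int(count)
    return [(word, str(total)) for word, total in merged.items()]

This only works because summing is commutative and associative, which is exactly the condition the paper gives for when a combiner may be applied.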