Slide 1

Slide 1 text

Apache Giraph
● What is it?
● How does it work?
● Dependencies
● Examples
www.semtech-solutions.co.nz [email protected]

Slide 2

Slide 2 text

Giraph – What is it?
● Graph processing for Hadoop V2
● For tasks that don't fit MapReduce
● Better performance for those tasks
● Processing by iterations called supersteps
● Uses Bulk Synchronous Parallel (BSP) computing
  – See Apache Hama presentation
● Licensed under the Apache License 2.0
● For distributed computing
● For massive calculations
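A rough sketch of the superstep idea, in plain Python rather than the Giraph Java API. Each pass of the loop is one superstep: every vertex that received messages reads them, may update its own value, and sends messages that are only delivered in the next superstep. The run ends when no messages are in flight. The function name `propagate_max` and the sample graph are made up for illustration.

```python
def propagate_max(values, neighbours):
    """Spread the largest vertex value through each connected component."""
    values = dict(values)  # work on a copy
    # Superstep 0: every vertex sends its own value to its neighbours.
    inbox = {}
    for vertex, value in values.items():
        for n in neighbours[vertex]:
            inbox.setdefault(n, []).append(value)
    supersteps = 1
    while inbox:  # stop once no vertex has mail
        outbox = {}
        for vertex, messages in inbox.items():
            biggest = max(messages)
            if biggest > values[vertex]:
                values[vertex] = biggest
                for n in neighbours[vertex]:
                    outbox.setdefault(n, []).append(biggest)
            # Otherwise the vertex stays quiet until new mail arrives.
        inbox = outbox  # barrier: mail is delivered in the next superstep
        supersteps += 1
    return values, supersteps


# Hypothetical 4-vertex undirected chain: 1 - 2 - 3 - 4
start_values = {1: 3, 2: 6, 3: 2, 4: 1}
links = {1: [2], 2: [1, 3], 3: [2, 4], 4: [3]}
final_values, steps = propagate_max(start_values, links)
print(final_values, steps)
```

Because messages are only swapped at the barrier, the order in which vertices are visited inside a superstep does not change the result. That is what lets a BSP system spread the vertices over many workers.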

Slide 3

Slide 3 text

Giraph – How does it work?
● Consider an example
  – Input is a chain graph
  – Find the shortest path
  – Three supersteps
  – Vertices have values
  – As do edges
  – Messages are passed between supersteps
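The chain-graph example can be simulated in a few lines. This is a sketch in plain Python, not Giraph code, and the three-vertex chain with its edge weights is an assumption, since the slide's actual graph is not given. Vertex values hold the best known distance, edge values hold weights, and messages carry candidate distances from one superstep to the next. Seeding the source with a message of 0 is a simplification made here.

```python
import math


def shortest_paths(edges, source):
    """Simulate superstep-style single-source shortest paths.

    edges maps each vertex to a list of (target, edge_weight) pairs.
    Returns (distances, number_of_supersteps_run).
    """
    # Vertex values start at infinity, meaning "not reached yet".
    dist = {vertex: math.inf for vertex in edges}
    # Seed: the source receives a candidate distance of 0.
    inbox = {source: [0]}
    supersteps = 0
    while inbox:  # halt when no messages are in flight
        outbox = {}
        for vertex, messages in inbox.items():
            best = min(messages)
            if best < dist[vertex]:  # a shorter path was found
                dist[vertex] = best
                for target, weight in edges[vertex]:
                    outbox.setdefault(target, []).append(best + weight)
        inbox = outbox  # barrier: messages arrive in the next superstep
        supersteps += 1
    return dist, supersteps


# Hypothetical directed chain A -> B -> C with edge values 1 and 3.
chain = {"A": [("B", 1)], "B": [("C", 3)], "C": []}
distances, steps = shortest_paths(chain, "A")
print(distances, steps)
```

On this chain the distance moves one vertex further along in each superstep, so the number of supersteps grows with the length of the chain.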

Slide 4

Slide 4 text

Giraph – Dependencies
● What does Apache Giraph need?
  – Java 1.6
  – Maven 3 or higher
  – ZooKeeper
  – Hadoop
    ● YARN (2.0.3-alpha) or
    ● Version 0.20.x
● So Giraph is graph processing for Hadoop V2!
● Based on Google Pregel

Slide 5

Slide 5 text

Giraph – Examples
● Consider the distance-between-friends problem
  – Facebook friends
  – (and) LinkedIn connections
  – Shortest distance between friends
● It's a graph
● Processing-intensive as a MapReduce job
● See next two slides
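A minimal sketch of the distance-between-friends problem, in plain Python with a made-up friendship graph (the names and the function `degrees_of_separation` are hypothetical). It walks the graph one hop at a time, and each hop corresponds to one superstep. In MapReduce, each hop would typically be a separate job that re-reads the graph, which is why this kind of task is costly there.

```python
def degrees_of_separation(friends, start, goal):
    """Return the number of friendship hops from start to goal, or None."""
    if start == goal:
        return 0
    visited = {start}
    frontier = {start}  # people reached in the previous hop
    hops = 0
    while frontier:
        hops += 1
        next_frontier = set()
        for person in frontier:
            for friend in friends[person]:
                if friend == goal:
                    return hops
                if friend not in visited:
                    visited.add(friend)
                    next_frontier.add(friend)
        frontier = next_frontier  # barrier between hops
    return None  # goal is not connected to start


# Hypothetical friendship graph; "eve" has no connections.
friends = {
    "ann": ["bob"],
    "bob": ["ann", "cat"],
    "cat": ["bob", "dan"],
    "dan": ["cat"],
    "eve": [],
}
print(degrees_of_separation(friends, "ann", "dan"))
print(degrees_of_separation(friends, "ann", "eve"))
```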

Slide 6

Slide 6 text

Giraph – Examples

Slide 7

Slide 7 text

Giraph – Examples

Slide 8

Slide 8 text

Contact Us
● Feel free to contact us at
  – www.semtech-solutions.co.nz
  – [email protected]
● We offer IT project consultancy
● We are happy to hear about your problems
● You pay only for the hours you need to solve your problems