
An Introduction to Apache Oozie

An introduction to Apache Oozie: what it is, what it is used for, and how it is used with Hadoop.

Mike Frampton

July 17, 2013

Transcript

  1. Apache Oozie
    • What is it?
    • Why use it?
    • Architecture
    • Examples
    www.semtech-solutions.co.nz [email protected]
  2. Oozie – What is it?
    • Workflow scheduler for Hadoop
    • Manages Hadoop jobs
    • Integrated with many Hadoop apps, e.g. Pig
    • Scalable
    • Schedules jobs
    • A workflow is a collection of actions, e.g.
      – map/reduce, Pig, HDFS
    • A workflow is
      – Arranged as a DAG (directed acyclic graph)
      – Stored as hPDL (an XML process definition language)
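The DAG idea on this slide can be sketched in a few lines of Python. This is only an illustration of "actions arranged as a directed acyclic graph", not Oozie's own scheduler or its hPDL format. The action names and the `execution_order` helper are hypothetical, and `graphlib` requires Python 3.9 or later.

```python
from graphlib import CycleError, TopologicalSorter

# Hypothetical workflow: each action maps to the set of actions it depends on.
workflow = {
    "load-to-hdfs": set(),
    "pig-clean": {"load-to-hdfs"},
    "mapreduce-aggregate": {"pig-clean"},
    "hdfs-cleanup": {"mapreduce-aggregate"},
}


def execution_order(graph):
    """Return one valid run order, or raise ValueError if the graph has a cycle."""
    try:
        # static_order() yields every action after all of its dependencies.
        return list(TopologicalSorter(graph).static_order())
    except CycleError as exc:
        # exc.args[1] holds the list of nodes that form the cycle.
        raise ValueError(f"not a DAG, cycle found: {exc.args[1]}") from exc


order = execution_order(workflow)
print(order)
```

Because the graph must be acyclic, an action can never end up waiting on itself, which is why a valid run order always exists.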
  3. Oozie – Why use it?
    • It is designed for Hadoop
    • It is open source
    • It is designed for big data
    • It allows you to design task workflows
    • It allows you to interact with jobs
      – Stop, start, suspend, resume, rerun
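The job commands listed on this slide (stop, start, suspend, resume, rerun) can be pictured as a small state machine. The sketch below is a simplified model for illustration only: the state names and the `TRANSITIONS` table are assumptions made for this example and are not taken from Oozie's implementation.

```python
# Simplified, illustrative job lifecycle. State names are assumptions for
# this sketch, not a description of Oozie internals.
TRANSITIONS = {
    ("PREP", "start"): "RUNNING",
    ("RUNNING", "suspend"): "SUSPENDED",
    ("SUSPENDED", "resume"): "RUNNING",
    ("RUNNING", "stop"): "KILLED",
    ("SUSPENDED", "stop"): "KILLED",
    ("KILLED", "rerun"): "RUNNING",
    ("FAILED", "rerun"): "RUNNING",
}


def apply_command(state, command):
    """Return the next state, or raise ValueError if the command is not allowed."""
    try:
        return TRANSITIONS[(state, command)]
    except KeyError:
        raise ValueError(f"cannot {command} a job in state {state}") from None


state = "PREP"
history = [state]
for command in ["start", "suspend", "resume", "stop", "rerun"]:
    state = apply_command(state, command)
    history.append(state)

print(" -> ".join(history))
```

Keeping the allowed transitions in one table makes it easy to see which commands are valid at each point, for example that a job has to be suspended before it can be resumed.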
  4. Oozie – Architecture
    • Install Oozie on an edge node, not on the cluster
    • Oozie has a client
      – Launches jobs and talks to the server
    • Oozie has a server
      – Controls jobs
      – Launches jobs
    • Pipelines
      – Chained workflows
      – The output of one workflow is the input to the next
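The pipeline idea, where each workflow's output feeds the next, can be sketched as plain function chaining. The three workflow functions and the paths below are hypothetical stand-ins; a real Oozie pipeline would pass HDFS locations between workflow jobs rather than calling Python functions.

```python
from functools import reduce


# Hypothetical workflows: each takes an input path and returns its output path.
def ingest(path):
    return f"{path}/ingested"


def clean(path):
    return f"{path}/cleaned"


def report(path):
    return f"{path}/report"


def run_pipeline(workflows, initial_input):
    """Run the workflows in order, feeding each output into the next workflow."""
    return reduce(lambda data, workflow: workflow(data), workflows, initial_input)


final_output = run_pipeline([ingest, clean, report], "/data/raw")
print(final_output)
```

The order of the list is the order of the chain, so reordering the workflows changes which output each one receives.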
  5. Contact Us
    • Feel free to contact us at
      – www.semtech-solutions.co.nz
      – [email protected]
    • We offer IT project consultancy
    • We are happy to hear about your problems
    • You can pay for just the hours that you need to solve your problems