An introduction to Apache Chukwa

An introduction to Apache Chukwa: what is it and
how does it work? Why is it important to monitor
Hadoop DFS, and how can it help us?

Mike Frampton

August 12, 2013

Transcript

  1. Apache Chukwa
     • What is it?
     • How does it work?
     • What can we collect?
     • Architecture
     www.semtech-solutions.co.nz [email protected]
  2. Chukwa – What is it?
     • For log collection and analysis
     • Designed for big data
     • Designed for Hadoop
     • Uses HDFS and MapReduce
     • Scalable
     • Robust
     • Provides a toolkit to analyse logs
  3. Chukwa – How does it work?
     • Chukwa agents run on source nodes
     • Agents transfer data to collectors, which save it to HDFS
     • Data sinks contain raw unsorted data
     • Data sinks clean data
     • Demux adds structure to create Chukwa records
     • Chukwa records go to database
     • Records are then ready to be analysed
  4. Chukwa – What can we collect?
     • Metrics
     • System logs
       – Defined format
       – Undefined format
     • Low latency
       – Access to log data
  5. Chukwa – Architecture
     • Chukwa agents
       – Reside on the Hadoop machines
       – Collect raw data
       – Use adaptors for data sources
       – Use HTTP to transmit data
       – Operate on data chunks
       – Can fail over between collectors
  6. Contact Us
     • Feel free to contact us at
       – www.semtech-solutions.co.nz
       – [email protected]
     • We offer IT project consultancy
     • We are happy to hear about your problems
     • You can pay for just the hours that you need to solve your problems
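
The flow described in the "How does it work" and "Architecture" slides (agents send chunks to collectors, fail over when a collector is unreachable, and demux turns raw sink data into structured records) can be illustrated with a small toy model. This is only a sketch in plain Python and does not use any Chukwa code: the `Chunk`, `Collector`, `Agent`, and `demux` names, the in-memory `sink` list standing in for HDFS, and the `key=value` log format are all assumptions made for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Chunk:
    """A unit of raw data from an adaptor (hypothetical shape)."""
    source: str  # the data source the adaptor watches, e.g. a log name
    data: str    # raw, unparsed payload


@dataclass
class Collector:
    """Stand-in for a collector; `sink` plays the role of the data sink."""
    name: str
    available: bool = True
    sink: list = field(default_factory=list)

    def receive(self, chunk: Chunk) -> None:
        if not self.available:
            raise ConnectionError(f"{self.name} is unreachable")
        self.sink.append(chunk)  # stored raw and unsorted


class Agent:
    """Stand-in for an agent that fails over between collectors."""

    def __init__(self, collectors):
        self.collectors = list(collectors)

    def send(self, chunk: Chunk) -> str:
        # Try each collector in order; move to the next one on failure.
        for collector in self.collectors:
            try:
                collector.receive(chunk)
                return collector.name
            except ConnectionError:
                continue
        raise RuntimeError("no collector available")


def demux(sink):
    """Add structure to raw chunks by splitting 'key=value' pairs (assumed format)."""
    records = []
    for chunk in sink:
        fields = dict(pair.split("=", 1) for pair in chunk.data.split())
        records.append({"source": chunk.source, **fields})
    return records


# One collector is down, so the agent should fall back to the second one.
primary = Collector("collector-1", available=False)
backup = Collector("collector-2")
agent = Agent([primary, backup])

used = agent.send(Chunk(source="syslog", data="host=node1 level=WARN"))
records = demux(backup.sink)
print(used, records)
```

In this model the agent simply walks its collector list in order; a real deployment would likely also need retries and back-off, which are left out to keep the sketch short.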