Slide 1

Slide 1 text

Apache Chukwa
● What is it?
● How does it work?
● What can we collect?
● Architecture
www.semtech-solutions.co.nz [email protected]

Slide 2

Slide 2 text

Chukwa – What is it?
● For log collection and analysis
● Designed for big data
● Designed for Hadoop
● Uses HDFS and MapReduce
● Scalable
● Robust
● Provides a toolkit to analyse logs

Slide 3

Slide 3 text

Chukwa – How does it work?
● Chukwa agents run on source nodes
● Agents transfer data to collectors, which save data to HDFS
● Data sinks contain raw unsorted data
● Data sinks clean data
● Demux adds structure to create Chukwa records
● Chukwa records go to a database
● Records are then ready to be analysed
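The flow on this slide can be sketched in a few lines of Python. This is a toy model, not Chukwa's API: the `Chunk` class, the `collect` and `demux` functions, and the "timestamp, then message" line format are all assumptions made for illustration, and a plain list stands in for the sink file on HDFS.

```python
from dataclasses import dataclass


@dataclass
class Chunk:
    """Hypothetical stand-in for the data an agent sends to a collector."""
    source: str     # host the agent runs on
    data_type: str  # label that tells the demux step how to parse the payload
    payload: str    # raw, unparsed log lines


def collect(chunks):
    """Collector stand-in: append chunks to a sink in arrival order (raw, unsorted)."""
    sink = []
    for chunk in chunks:
        sink.append(chunk)
    return sink


def demux(sink):
    """Demux stand-in: split raw chunks into structured records, sorted by time."""
    records = []
    for chunk in sink:
        for line in chunk.payload.splitlines():
            if not line.strip():
                continue  # skip blank lines in the raw data
            # Assumed line format: "<timestamp> <message>"
            timestamp, _, body = line.partition(" ")
            records.append({
                "source": chunk.source,
                "type": chunk.data_type,
                "time": timestamp,
                "body": body,
            })
    return sorted(records, key=lambda r: r["time"])


# Two agents report out of order; demux restores the time order.
chunks = [
    Chunk("node2", "syslog", "1002 disk full\n1003 rebooting"),
    Chunk("node1", "syslog", "1001 started"),
]
records = demux(collect(chunks))
```

After this runs, `records` holds three dictionaries ordered by the `time` field, which is the kind of structured output that would be loaded into a database for analysis.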

Slide 4

Slide 4 text

Chukwa – What can we collect?
● Metrics
● System logs
  – Defined format
  – Undefined format
● Low latency
  – Access to log data

Slide 5

Slide 5 text

Chukwa – Architecture

Slide 6

Slide 6 text

Chukwa – Architecture
● Chukwa agents
  – Reside on the Hadoop machines
  – Collect raw data
  – Use adaptors for data sources
  – Use HTTP to transmit data
  – Operate on data chunks
  – Can fail over between collectors
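The fail-over behaviour listed above can be illustrated with a small Python sketch. It is not Chukwa's implementation: `send_with_failover`, `CollectorDown`, the collector URLs and the `fake_post` transport are all hypothetical names, and the real HTTP call is replaced by an injected function so that the example runs without a network.

```python
class CollectorDown(Exception):
    """Raised by the transport when a collector cannot be reached."""


def send_with_failover(chunk, collectors, post):
    """Offer the chunk to each collector in turn; return the URL that accepted it."""
    for url in collectors:
        try:
            post(url, chunk)  # `post` stands in for an HTTP POST
            return url
        except CollectorDown:
            continue  # this collector is down, try the next one
    raise CollectorDown("no collector accepted the chunk")


# Hypothetical collector URLs, for illustration only.
COLLECTORS = [
    "http://collector1.example/chunks",
    "http://collector2.example/chunks",
]


def fake_post(url, chunk):
    """Fake transport: pretend the first collector is unreachable."""
    if "collector1" in url:
        raise CollectorDown(url)


accepted_by = send_with_failover(b"raw log bytes", COLLECTORS, fake_post)
```

Here the first collector refuses the chunk, so `accepted_by` ends up as the second URL. Passing the transport in as a parameter keeps the retry logic separate from the HTTP details and makes it easy to exercise with a fake.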

Slide 7

Slide 7 text

Contact Us
● Feel free to contact us at
  – www.semtech-solutions.co.nz
  – [email protected]
● We offer IT project consultancy
● We are happy to hear about your problems
● You pay only for the hours you need to solve your problems