An introduction to Apache Whirr

A short introduction to Apache Whirr. What is it, and how does it relate to the cloud?
How can it be used with Hadoop?

Mike Frampton

December 14, 2013

Transcript

  1. Apache Whirr
    • What is it?
    • How does it work?
    • The Cloud
    • Architecture
    • Use with Hadoop
    www.semtech-solutions.co.nz [email protected]
  2. Apache Whirr – What is it?
    • A library-based cloud service system
    • API libraries for cloud providers
    • Choose a configuration file to define a cluster
    • High-level interaction
    • Services based on roles
  3. Apache Whirr – How does it work?
    • Libraries provide a high-level API
    • Based on jclouds
    • Recipe-based approach for cloud providers, e.g. EC2
    • It is cloud neutral
    • Create clusters as you need them for
      – Dev / test etc.
  4. Apache Whirr – How does it work?
    • Automatically start instances on the Cloud
    • Configure and start Hadoop
    • Add applications like
      – Hive
      – HBase
      – YARN / MapReduce
  5. Apache Whirr – Why go virtual?
    • Whirr gives independence from the Cloud vendor
    • Makes it easier to move vendors later
    • Save money by only using what you need
    • Expand the cluster as demand requires
    • Reduce the cluster when possible
    • Compress data as much as possible to reduce costs
    • Virtual cost < physical cost (should be) until
      – Data sizes reach the high Tbyte to low Pbyte range
  6. Apache Whirr – Supported
    • What cloud suppliers are available?
      – Amazon EC2
      – Rackspace Cloud Services
    • What services do they support?
      – Cassandra
      – Hadoop
      – ZooKeeper
      – HBase
      – Elasticsearch
      – Voldemort
      – Hama
  7. Apache Whirr – Example config
    Example Whirr configuration - .whirr/credentials
      whirr.provider=aws-ec2
      whirr.identity=your-aws-key
      whirr.credential=your-aws-secret
    Whirr Hadoop configuration - hadoop.properties
      whirr.cluster-name=myhadoopcluster
      whirr.instance-templates=1 hadoop-jobtracker+hadoop-namenode,1 hadoop-datanode+hadoop-tasktracker
      whirr.provider=aws-ec2
    Start Whirr
      whirr launch-cluster --config hadoop.properties
  8. Contact Us
    • Feel free to contact us at
      – www.semtech-solutions.co.nz
      – [email protected]
    • We offer IT project consultancy
    • We are happy to hear about your problems
    • You pay only for the hours you need to solve your problems
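
The whirr.instance-templates line on slide 7 packs the whole cluster layout into one string: comma-separated groups, each an instance count followed by the roles for that instance joined with "+". As a minimal sketch of how to read that value, the Python below splits it into groups and totals the instances per role. The function names parse_instance_templates and role_totals are hypothetical helpers written for this illustration; they are not part of Whirr.

```python
from collections import Counter


def parse_instance_templates(value):
    """Split a whirr.instance-templates value into (count, roles) groups.

    Hypothetical helper for illustration only; not a Whirr API.
    """
    groups = []
    for group in value.split(","):
        # Each group looks like "<count> <role>+<role>+..."
        count, roles = group.strip().split(None, 1)
        groups.append((int(count), roles.split("+")))
    return groups


def role_totals(groups):
    """Count how many instances would run each role."""
    totals = Counter()
    for count, roles in groups:
        for role in roles:
            totals[role] += count
    return totals


# The value from hadoop.properties on slide 7.
templates = "1 hadoop-jobtracker+hadoop-namenode,1 hadoop-datanode+hadoop-tasktracker"

groups = parse_instance_templates(templates)
total_instances = sum(count for count, _ in groups)

for count, roles in groups:
    print(count, "instance(s) running:", ", ".join(roles))
print("total instances:", total_instances)
```

With the slide 7 value this describes two instances: one master running the jobtracker and namenode roles, and one worker running the datanode and tasktracker roles. Changing the second group to "3 hadoop-datanode+hadoop-tasktracker" would describe a four-instance cluster with three workers.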