Jason Bell

Tara Simpson
November 11, 2013

Slides from Jason Bell's recent presentation at 'Bash', a semi-regular developer event held in Belfast (http://instil.co/2013/09/18/shiny-bash/).

Introduction to Hadoop and Machine Learning

Jason is technical architect for SportsFusion. He has previously worked with Learning Pool, AirPOS, The Press Association and others in Northern Ireland, England and California. His main areas of expertise are Java, Hadoop and RabbitMQ messaging, with a strong leaning towards customer prediction data and analytics. He also teaches core programming concepts for the University of Ulster and is working on a book on Machine Learning for a 2014 release.

Transcript

  1. What is Hadoop?
     A real quick overview (AP)
     Word count == Hello World
     A more useful example
  2. What is Hadoop?
     A real quick overview (AP)
     Word count == Hello World
     A more useful example
     Machine Learning
  3. What is Hadoop?
     A real quick overview (AP)
     Word count == Hello World
     A more useful example
     Machine Learning
     Q & A
  4. An open source framework for the storage and large-scale processing of data sets on clusters of commodity hardware.
  5. AP = Audience Participation
     (and I know you’ll all hate me for it but I really don’t care that much, we’ll still be friends...)
  6. 20,000 customers
     1. The average 12-month sales.
     2. The variance.
     3. The month 13 sales drop against the average.
     4. Number of months which are 40% below average.
  7. 20,000 customers
     So far we’ve only output to one file. Now I want to segment my customers into groups I can market to.
  8. >>Recommending Stuff
     196 242 3 881250949
     186 302 3 891717742
     22 377 1 878887116
     244 51 2 880606923
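
Slides 1 to 3 call word count the "Hello World" of Hadoop. The slides do not include the code, so the following is only a minimal sketch of the map, shuffle and reduce steps in plain Python, not the Java job from the talk. The function names and the sample input are hypothetical, and everything runs in one process rather than on a cluster.

```python
from collections import defaultdict


def map_phase(line):
    """Mapper: emit a (word, 1) pair for every word in one line of input."""
    for word in line.lower().split():
        yield word, 1


def shuffle(pairs):
    """Shuffle: group emitted values by key, a job the framework normally does."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_phase(word, counts):
    """Reducer: sum the counts collected for one word."""
    return word, sum(counts)


def word_count(lines):
    """Run the three phases in sequence over an iterable of text lines."""
    pairs = (pair for line in lines for pair in map_phase(line))
    return dict(reduce_phase(w, c) for w, c in shuffle(pairs).items())


if __name__ == "__main__":
    sample = ["hello world", "hello hadoop"]  # hypothetical input
    print(word_count(sample))
```

On a real cluster the mapper and reducer would run on different machines and the shuffle would move data between them; the sketch keeps only the data flow.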
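
Slide 6 lists four figures to compute for each of the 20,000 customers. As a rough sketch of what a reducer might calculate for a single customer, the function below assumes the input is a list of 13 monthly sales values (12 months of history plus month 13). That input shape, the use of population variance, and reading "40% below" as "less than 60% of the average" are all assumptions, since the slides do not spell them out.

```python
from statistics import mean, pvariance


def customer_stats(monthly_sales):
    """Compute the four slide-6 metrics from 13 monthly sales figures."""
    if len(monthly_sales) != 13:
        raise ValueError("expected 12 months of history plus month 13")
    first_12, month_13 = monthly_sales[:12], monthly_sales[12]
    avg = mean(first_12)
    return {
        "average": avg,
        # Population variance over the 12 months (an assumption).
        "variance": pvariance(first_12),
        # Positive value means month 13 fell below the 12-month average.
        "month_13_drop": avg - month_13,
        # Months whose sales are under 60% of the average (an assumption).
        "months_40pct_below": sum(1 for s in first_12 if s < 0.6 * avg),
    }


if __name__ == "__main__":
    # Hypothetical customer: ten flat months, one dip, one spike, then month 13.
    sales = [100.0] * 10 + [50.0, 150.0] + [80.0]
    print(customer_stats(sales))
```

In a MapReduce job the customer id would be the key and the monthly figures the grouped values, so a function like this would sit inside the reducer.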
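
The numbers on slide 8 look like rating records of four fields each. The slide does not label the fields, so treating them as user id, item id, rating and timestamp is an assumption. Under that assumption, this sketch parses the records and builds the user-to-item rating map that a collaborative-filtering recommender would typically start from. The `Rating` type and both function names are hypothetical.

```python
from collections import defaultdict, namedtuple

# Assumed field order; the slide itself does not name the columns.
Rating = namedtuple("Rating", "user item rating timestamp")


def parse_ratings(text):
    """Split whitespace-separated integers into four-field Rating records."""
    tokens = text.split()
    if len(tokens) % 4 != 0:
        raise ValueError("expected a multiple of four fields")
    return [
        Rating(*(int(t) for t in tokens[i:i + 4]))
        for i in range(0, len(tokens), 4)
    ]


def ratings_by_user(ratings):
    """Build {user: {item: rating}}, the usual input to a recommender."""
    by_user = defaultdict(dict)
    for r in ratings:
        by_user[r.user][r.item] = r.rating
    return dict(by_user)


if __name__ == "__main__":
    slide_data = (
        "196 242 3 881250949 186 302 3 891717742 "
        "22 377 1 878887116 244 51 2 880606923"
    )
    print(ratings_by_user(parse_ratings(slide_data)))
```

Four records from four different users are too few to recommend anything; the sketch only shows the data preparation step.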