
Catflap


Talk by Andreas Hübner at the Datageeks meetup 09.2015

MunichDataGeeks

September 25, 2015


Transcript

  1. Ready for cat content? Andreas Hübner - @ahueb, cat owner / data scientist. Munich DataGeeks - September 2015 Edition
  2. I started programming websites as a teenager. Not fully convinced that I wanted to be purely a software developer, I studied Business Informatics and later focused on machine learning and mathematical optimization in Paderborn and Helsinki. My Bachelor's and Master's theses dealt with outlier detection and recommender systems. In my student job, I helped design operations research solutions. After I finished my Master's studies, Alex Thamm offered me a fantastic position as a consultant for business analytics projects. Since then I've seen a number of real-world data science problems (including predictive maintenance, pattern search, anomaly detection and more), teams and projects. I enjoy studying and solving data problems (even in my spare time...), keeping their potential effects on business in mind. And I like cats.
  3. ?

  4. Observer (state diagram). States: Waiting, Collecting, Cooldown. Transition labels: No sensor activity; Sensor active; Sensor active; No sensor activity for more than 10 seconds.
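
The observer on slide 4 can be read as a small state machine. The sketch below is one possible wiring, not the talk's actual implementation: the slide names the three states and the transition labels, but which label connects which states is an assumption here, as are the names `Observer` and `on_sample` and the idea of storing finished patterns in a list.

```python
from enum import Enum


class State(Enum):
    WAITING = "waiting"
    COLLECTING = "collecting"
    COOLDOWN = "cooldown"


class Observer:
    """Groups raw sensor samples into patterns (hypothetical wiring)."""

    COOLDOWN_SECONDS = 10.0  # from the slide: "more than 10 seconds"

    def __init__(self):
        self.state = State.WAITING
        self.current = []        # samples of the pattern being collected
        self.patterns = []       # finished patterns
        self.quiet_since = None  # time at which the sensor went quiet

    def on_sample(self, t, active, value=None):
        """Feed one sample: timestamp in seconds, activity flag, raw value."""
        if active:
            # Assumption: any activity (re)starts collection, also in cooldown.
            self.state = State.COLLECTING
            self.quiet_since = None
            self.current.append(value)
        elif self.state is State.COLLECTING:
            # Sensor went quiet: start the cooldown timer.
            self.state = State.COOLDOWN
            self.quiet_since = t
        elif self.state is State.COOLDOWN:
            if t - self.quiet_since > self.COOLDOWN_SECONDS:
                # Quiet for long enough: close the pattern and wait again.
                self.patterns.append(self.current)
                self.current = []
                self.state = State.WAITING
```

With this wiring, a short pause inside a passage does not split the pattern; only more than 10 seconds of silence closes it.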
  5. 11111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111 11111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111 11111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111 11111111111111111111111111111111111111111111111111111111100220000000000000000000000000000000000000 00000000000000000000000000000000000000000000000000000000000000000000000000000000000000111111111111 11111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111 11111111111111111111111111111111111111111111111111000000000000000000000000000000000000000000000000 00000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000 00000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000 00000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000

    00000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000 00000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000 000000000000000011111022000000000000000000000000000222000011111111111111111111111111111111100022
    Annotations: “Head” (flap opens); “Segment”; longest “zero segment” (passage).
    In or out? Criteria:
    1. Last direction before the longest zero segment
    2. First direction
    3. Number of in/out segments before the longest zero segment
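
The three criteria from slide 5 can be sketched with a run-length encoding of the sensor string. This is an illustration under assumptions: the slide does not say which symbol means "in" and which means "out", so the sketch treats every non-zero symbol as a direction segment and returns the raw symbols. The function names are hypothetical.

```python
from itertools import groupby


def segments(stream):
    """Run-length encode a sensor string into (symbol, length) pairs."""
    return [(symbol, len(list(run))) for symbol, run in groupby(stream)]


def passage_features(stream):
    """Compute the three criteria around the longest zero segment."""
    runs = segments(stream)
    zero_runs = [i for i, (symbol, _) in enumerate(runs) if symbol == "0"]
    if not zero_runs:
        return None  # no zero segment, hence no passage candidate
    longest = max(zero_runs, key=lambda i: runs[i][1])
    # Assumption: any non-zero symbol marks a direction segment.
    directions = [symbol for symbol, _ in runs if symbol != "0"]
    before = [symbol for symbol, _ in runs[:longest] if symbol != "0"]
    return {
        "longest_zero_length": runs[longest][1],
        "last_direction_before": before[-1] if before else None,
        "first_direction": directions[0] if directions else None,
        "direction_segments_before": len(before),
    }
```

For a toy stream such as `"1110220000011"`, the longest zero segment has length 5, and two direction segments (`"1"`, then `"2"`) precede it.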
  6. K-means clustering
    Are there meaningful clusters?
    Center and scale the data!
    Find out by permuting the whole data set. Run some resamples.
    The clusters with the true data should be denser.
    Isolate the faulty patterns (testing, Mika prying the flap).
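
The permutation check from slide 6 might look roughly like the following. It is a minimal sketch using only the standard library, not the code behind the talk: "denser" is measured here as a lower within-cluster sum of squares, each column is shuffled independently to build the permuted data, and all function names and defaults are assumptions.

```python
import random
from statistics import mean, pstdev


def standardize(rows):
    """Center and scale every column to mean 0 and standard deviation 1."""
    cols = list(zip(*rows))
    mus = [mean(c) for c in cols]
    sds = [pstdev(c) or 1.0 for c in cols]  # guard against constant columns
    return [[(x - m) / s for x, m, s in zip(row, mus, sds)] for row in rows]


def sq_dist(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans_inertia(rows, k, rng, iters=20):
    """Plain Lloyd's algorithm; returns the within-cluster sum of squares."""
    centers = rng.sample(rows, k)
    for _ in range(iters):
        groups = [[] for _ in range(k)]
        for row in rows:
            nearest = min(range(k), key=lambda j: sq_dist(row, centers[j]))
            groups[nearest].append(row)
        # Keep the old center if a cluster ends up empty.
        centers = [
            [mean(col) for col in zip(*g)] if g else centers[j]
            for j, g in enumerate(groups)
        ]
    return sum(min(sq_dist(row, c) for c in centers) for row in rows)


def permuted(rows, rng):
    """Shuffle each column independently to destroy the joint structure."""
    cols = [list(c) for c in zip(*rows)]
    for c in cols:
        rng.shuffle(c)
    return [list(r) for r in zip(*cols)]


def density_check(rows, k, resamples=20, seed=0):
    """Compare the inertia of the real data with permuted resamples."""
    rng = random.Random(seed)
    data = standardize(rows)
    true_inertia = kmeans_inertia(data, k, rng)
    null = [kmeans_inertia(permuted(data, rng), k, rng) for _ in range(resamples)]
    return true_inertia, null
```

If the true inertia is clearly below the permuted ones, the clusters are unlikely to be an artifact of the marginal distributions alone.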
  7. Identify faulty patterns on the fly
    Semi-supervised learning (self-training):
    Use the cluster labels for patterns
    Train a model (yes, validate and test…)
    Apply the model to new patterns
    Give feedback (send a notification)
    Store the predicted class
    Add the most confident predictions to the training set
    Retrain the model (possibly down-weight by propensity score and “age”)
    http://is.tuebingen.mpg.de/fileadmin/user_upload/files/publications/taxo_[0].pdf
    https://en.wikipedia.org/wiki/Ma%C3%9F#/media/File:Jugg_with_Beer_Loewenbraeu_one_liter.JPG
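
The self-training loop from slide 7 can be sketched as follows. This is an illustration under assumptions, not the production system: a nearest-centroid classifier stands in for whatever model was actually used, the confidence measure and the `threshold` value are invented for the example, at least two classes are required, and validation, notifications and the propensity/age down-weighting are left out.

```python
from statistics import mean


def fit_centroids(samples):
    """Train a nearest-centroid model from (features, label) pairs."""
    by_label = {}
    for features, label in samples:
        by_label.setdefault(label, []).append(features)
    return {
        label: [mean(col) for col in zip(*rows)]
        for label, rows in by_label.items()
    }


def predict(model, features):
    """Return (label, confidence); confidence is 1 minus the distance ratio."""
    dists = sorted(
        (sum((x - c) ** 2 for x, c in zip(features, center)) ** 0.5, label)
        for label, center in model.items()
    )
    (best, label), (second, _) = dists[0], dists[1]
    confidence = 1.0 - best / second if second > 0 else 0.0
    return label, confidence


def self_train(labeled, unlabeled, threshold=0.8, rounds=5):
    """Add confident predictions to the training set and retrain."""
    train = list(labeled)
    pool = list(unlabeled)
    model = fit_centroids(train)
    for _ in range(rounds):
        scored = [(features, *predict(model, features)) for features in pool]
        confident = [(f, lab) for f, lab, conf in scored if conf >= threshold]
        if not confident:
            break  # nothing left that the model is sure about
        train.extend(confident)
        pool = [f for f, _, conf in scored if conf < threshold]
        model = fit_centroids(train)
    return model, train, pool
```

Patterns that never reach the threshold stay in the pool, which is where a notification and manual feedback could come in.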
  8. Mika comes home before we go to bed and after (or right before!!!) we get up.
  9. On weekends, Mika hesitates before and rushes after 12am. At night, Mika is slower than during daytime.
  10. Seasonal patterns? Yes, there are. (*The y scale is free.) Mika wakes us at night during the summer months. Here’s the evidence. Fewer passages during summer. Open doors and warm nights!
  11. 3600 passages recorded
    32 MB storage used (small data, yay!)
    About 150 faulty patterns
    Production system since 8/2014