Machine Learning for HCI @ NTU CSIE, 2013/7/21

Johnson Liang

July 21, 2013

Transcript

  1. Machine Learning for HCI Johnson @ NTU Mobile HCI Lab

  2. ML in Projects: iRotateGrasp by Xman 龍哥; SenSleep by Jimmyken; RingTune (final project for the Data Mining course)
  3. iRotateGrasp

  4. iRotateGrasp

  5. None
  6. SenSleep

  7. SenSleep Basic Flow

  8. Ringtune

  9. Ringtune

  10. Ringtune (incoming call, ringer volume adjustment, shaking, sound, proximity sensing, light sensing, sqlite, wifi, train data)
  11. What’s Common in Those Projects? They collect input data. iRotateGrasp: 44 capacitive sensor values. SenSleep: mobile & PC activities. Ringtune: ambient sound, accelerations, light.
  12. What’s Common in Those Projects? (2) The data is used to determine an output. iRotateGrasp: screen orientation. SenSleep: whether the user is sleeping in a time slot. Ringtune: the desired ringer volume (0 to 7).
  13. What’s Common in Those Projects? (2) The data is used to determine an output. iRotateGrasp: screen orientation. SenSleep: whether the user is sleeping in a time slot. Ringtune: the desired ringer volume (0 to 7).
  14. The Core of Decision Making: Classifier. Given: <input, output> pairs. Goal: given any input, predict the output.
  15. Training & Testing a Classifier. Training: learns from data, i.e. many <input, output> pairs. Testing: asks for the output of <input, ?>.
  16. How & What to Train?

  17. How & What to Train? <input, output>

  18. How & What to Train? <input, output> What data should we collect? How many samples?

  19. How & What to Train? <input, output> What data should we collect? How many samples? Where does the “correct answer” (ground truth) come from?
  20. iRotateGrasp’s Approach

  21. iRotateGrasp Prototype

  22. 44 sensors

  23. 44 sensors: < s1, s2, ..., s44 > & output. 44-value input + output
  24. 44 sensors: < s1, s2, ..., s44 > & output. 44-value input + output. LIBSVM Classifier. Chih-Chung Chang and Chih-Jen Lin, LIBSVM: a library for support vector machines. ACM Transactions on Intelligent Systems and Technology, 2:27:1--27:27, 2011
  25. 44 sensors: < s1, s2, ..., s44 > & output. 44-value input + output. LIBSVM Classifier. Chih-Chung Chang and Chih-Jen Lin, LIBSVM: a library for support vector machines. ACM Transactions on Intelligent Systems and Technology, 2:27:1--27:27, 2011
  26. LIBSVM Data Collection Session 

  27. LIBSVM Data Collection Session 

  28. LIBSVM Data Collection Session Stand Sit Lie down Lie down

    (side) ❌ 
  29. LIBSVM Data Collection Session Stand Sit Lie down Lie down

    (side) Left hand Right hand Both hands ❌ ❌ 
  30. LIBSVM Data Collection Session Stand Sit Lie down Lie down

    (side) Left hand Right hand Both hands ❌ ❌ ❌ 
  31. LIBSVM Data Collection Session Stand Sit Lie down Lie down

    (side) Left hand Right hand Both hands ❌ ❌ ❌  162,000 samples
  32. SenSleep’s Approach

  33. SenSleep Data Collection. Mobile: screen lock/unlock events; accelerometer values; battery charging events; light sensor values; current location change events; system-defined broadcast events. Desktop PC: idle intervals. 12 participants, 7 days.
  34. SenSleep Ground Truth: IP camera in participants’ bedrooms. Participants see the pictures taken & report their actual sleeping time.
  35. That’s A Lot to Learn From!

  36. That’s A Lot to Learn From! Too much raw data

    Requires lots of training data to conclude!
  37. That’s A Lot to Learn From! Too much raw data

    Requires lots of training data to conclude! What really matters? “Features” Features should describe our data better.
  38. SenSleep Features: screen on/off (0 or 1); elapsed time since screen on/off; battery charging on/off; elapsed time since last battery event; current coordinate (location); offset in location, compared to 15 min before; accelerometer average values; accelerometer median values; elapsed time since last PC keyboard / mouse activity. <f1, ..., f9>
  39. SenSleep Features: screen on/off (0 or 1); elapsed time since screen on/off; battery charging on/off; elapsed time since last battery event; current coordinate (location); offset in location, compared to 15 min before; accelerometer average values; accelerometer median values; elapsed time since last PC keyboard / mouse activity. <f1, ..., f9>: 9-dimensional feature vector
  40. Training SenSleep Classifier <f1, ..., f9> 9-dimensional feature vector +

    is_sleeping label Input Output
  41. History-Related Feature < f1, ..., f9 >

  42. History-Related Feature < f1, ..., f9 > < last_is_sleeping, f1,

    ..., f9 >
  43. History-Related Feature < f1, ..., f9 > < last_is_sleeping, f1,

    ..., f9 > < last_f1, ..., last_f9, f1, ..., f9 >
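The two history-related variants on the slides above can be sketched as follows. This is a minimal illustration, not the SenSleep implementation: the function name `add_history` and the choice to drop the first time slot (which has no predecessor) are assumptions.

```python
from typing import List, Optional, Sequence


def add_history(features: Sequence[Sequence[float]],
                labels: Optional[Sequence[int]] = None) -> List[List[float]]:
    """Prepend history to each time slot's feature vector.

    With labels:    < last_is_sleeping, f1, ..., f9 >
    Without labels: < last_f1, ..., last_f9, f1, ..., f9 >
    """
    out = []
    # Slot 0 has no previous slot, so it is skipped (an assumption).
    for t in range(1, len(features)):
        if labels is not None:
            history = [float(labels[t - 1])]
        else:
            history = list(features[t - 1])
        out.append(history + list(features[t]))
    return out


# Toy data with 2 features per slot instead of 9, for readability.
slots = [[1, 2], [3, 4], [5, 6]]
with_last_label = add_history(slots, labels=[0, 1, 1])
with_last_features = add_history(slots)
```

The second variant doubles the vector length but does not need the previous slot's label to be known.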
  44. RingTune’s Approach

  45. Ringer Volume Adj. (incoming call, ringer volume adjustment, shaking, sound, proximity sensing, light sensing, sqlite, wifi, train data) Implementation »
  46. Incoming Call (incoming call, ringer volume adjustment, shaking, sound, proximity sensing, light sensing, sqlite, wifi, train data) Implementation »
  47. Collection Pending (incoming call, ringer volume adjustment, shaking, sound, proximity sensing, light sensing, sqlite, wifi, train data) Implementation »
  48. Turn On Sensors (incoming call, ringer volume adjustment, shaking, sound, proximity sensing, light sensing, sqlite, wifi, train data) Implementation »
  49. Classifier. Implementation » avg_x, avg_y, avg_z, var_x, var_y, var_z, avg_dx, avg_dy, avg_dz, light, close: 11D feature vector & volume
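A sketch of how an 11-value vector with these names could be computed from one window of sensor readings. The slides only give the feature names, so the definitions here are assumptions: `avg_d*` is taken to be the mean absolute change between consecutive accelerometer samples, `light` the raw light-sensor reading, and `close` a 0/1 proximity flag.

```python
from statistics import fmean, pvariance


def ringtune_features(accel, light, close):
    """Build [avg_x, avg_y, avg_z, var_x, var_y, var_z,
              avg_dx, avg_dy, avg_dz, light, close].

    accel: list of (x, y, z) accelerometer samples (at least two).
    """
    axes = list(zip(*accel))  # one tuple of readings per axis
    avgs = [fmean(axis) for axis in axes]
    variances = [pvariance(axis) for axis in axes]
    # Mean absolute difference between consecutive samples (assumed definition).
    deltas = [
        fmean([abs(b - a) for a, b in zip(axis, axis[1:])])
        for axis in axes
    ]
    return avgs + variances + deltas + [float(light), float(close)]


window = [(0, 0, 9), (2, 0, 9), (4, 0, 9)]
vector = ringtune_features(window, light=120, close=1)
```

Each such vector would be paired with the ringer volume the user chose, giving one <input, output> training pair.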
  50. Feature Extraction in Other Work

  51. Phone Localization Martin et al. MobiCom '09, Duke University SurroundSense:

    mobile phone localization via ambience fingerprinting
  52. Phone Localization Martin et al. MobiCom '09, Duke University SurroundSense:

    mobile phone localization via ambience fingerprinting Sound
  53. Phone Localization Martin et al. MobiCom '09, Duke University SurroundSense:

    mobile phone localization via ambience fingerprinting Sound Color of Light
  54. Phone Localization Martin et al. MobiCom '09, Duke University SurroundSense:

    mobile phone localization via ambience fingerprinting Sound Color of Light Motion
  55. Sound Feature Waveform Waveform (Zoomed to samples) 1 0 -1

    Martin et al. MobiCom '09, Duke University SurroundSense: mobile phone localization via ambience fingerprinting
  56. Color Feature Martin et al. MobiCom '09, Duke University SurroundSense:

    mobile phone localization via ambience fingerprinting
  57. Motion Feature: Moving / Static. Feature: moving average & variance of instantaneous acceleration. Martin et al. MobiCom '09, Duke University SurroundSense: mobile phone localization via ambience fingerprinting
  58. Sensing Grip Pattern Determine On Table / In Hand Thumb

    / Index Finger Left / Right Thumb Pressure Mayank et al. UIST '12, University of Washington GripSense: using built-in sensors to detect hand posture and pressure on commodity mobile phones
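  59. Actual Application in Use. Hailpern et al. DIS '10, University of Illinois at Urbana Champaign. The CLOTHO Project: Predicting Application Utility. Current system snapshot (System Snapshot): application launch / exit; window switching; login / logout; boot / shutdown; CPU usage of an application; RAM usage of an application; window z-buffer; desktop size; window size; window coordinates; window visible region; focused app; mouse position; timestamp. The “important application” actually in use: High-utilization Application.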
  60. Document Classification Bag of words John likes to watch movies.

    Mary likes too. John also likes to watch football games.
  61. Document Classification: Bag of words. John likes to watch movies. Mary likes too. John also likes to watch football games. Dictionary: John, likes, to, watch, movies, also, football, games, Mary, too
  62. Document Classification: Bag of words. John likes to watch movies. Mary likes too. John also likes to watch football games. Dictionary: John, likes, to, watch, movies, also, football, games, Mary, too. <1, 2, 1, 1, 1, 0, 0, 0, 1, 1> <1, 1, 1, 1, 0, 1, 1, 1, 0, 0> (Multinomial, counts occurrences)
  63. Document Classification Bag of words John likes to watch movies.

    Mary likes too. John also likes to watch football games. Dictionary John, likes, to, watch, movies, also, football, games, Mary, too <1, 1, 1, 1, 1, 0, 0, 0, 1, 1> <1, 1, 1, 1, 0, 1, 1, 1, 0, 0> (Bernoulli, present or not)
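The two bag-of-words encodings can be reproduced with a few lines of Python. This is a small sketch: the regular-expression tokenizer is an assumption, while the dictionary order and the two documents are taken from the slides.

```python
import re
from collections import Counter

DICTIONARY = ["John", "likes", "to", "watch", "movies",
              "also", "football", "games", "Mary", "too"]


def tokenize(text):
    """Split on anything that is not a letter (a simplistic tokenizer)."""
    return re.findall(r"[A-Za-z]+", text)


def multinomial_vector(text, dictionary=DICTIONARY):
    """Count how many times each dictionary word occurs."""
    counts = Counter(tokenize(text))
    return [counts[word] for word in dictionary]


def bernoulli_vector(text, dictionary=DICTIONARY):
    """Record only whether each dictionary word is present (1) or not (0)."""
    present = set(tokenize(text))
    return [1 if word in present else 0 for word in dictionary]


doc1 = "John likes to watch movies. Mary likes too."
doc2 = "John also likes to watch football games."

print(multinomial_vector(doc1))  # [1, 2, 1, 1, 1, 0, 0, 0, 1, 1]
print(bernoulli_vector(doc1))    # [1, 1, 1, 1, 1, 0, 0, 0, 1, 1]
print(bernoulli_vector(doc2))    # [1, 1, 1, 1, 0, 1, 1, 1, 0, 0]
```

The only difference between the two encodings for these documents is the second entry of the first vector, because "likes" appears twice in the first document.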
  64. “Words” in Chinese: word segmentation is needed to get a feature vector. Dictionary-based: 中研院斷詞系統 (the Academia Sinica word segmentation system), Stanford Word Segmenter, MMSeg. n-gram
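The n-gram alternative needs no dictionary: every run of n consecutive characters becomes a token. A minimal sketch, where the function name `char_ngrams` and the decision to strip whitespace first are assumptions:

```python
def char_ngrams(text, n=2):
    """Return all overlapping character n-grams of text, ignoring whitespace."""
    chars = [c for c in text if not c.isspace()]
    return ["".join(chars[i:i + n]) for i in range(len(chars) - n + 1)]


bigrams = char_ngrams("機器學習", n=2)
print(bigrams)  # ['機器', '器學', '學習']
```

The resulting tokens can then be counted the same way as words in a bag-of-words vector.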
  65. Evaluating a Classifier

  66. Confusion Matrix (Binary)

  67. Confusion Matrix

  68. Precision, Recall PJ Cheng, Text Categorization, 2013 Web IR Class

  69. None
  70. Accuracy
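For a binary classifier, the confusion matrix, precision, recall, and accuracy named on the preceding slides can be computed as below. This is a sketch using the standard definitions; the labels and predictions are made-up toy values, not results from any of the projects.

```python
def confusion_counts(actual, predicted, positive=1):
    """Return (tp, fp, fn, tn) for a binary task."""
    tp = fp = fn = tn = 0
    for a, p in zip(actual, predicted):
        if p == positive and a == positive:
            tp += 1
        elif p == positive:
            fp += 1
        elif a == positive:
            fn += 1
        else:
            tn += 1
    return tp, fp, fn, tn


def precision(tp, fp):
    """Of everything predicted positive, the fraction that really is."""
    return tp / (tp + fp) if tp + fp else 0.0


def recall(tp, fn):
    """Of everything actually positive, the fraction that was found."""
    return tp / (tp + fn) if tp + fn else 0.0


def accuracy(tp, fp, fn, tn):
    """Fraction of all predictions that are correct."""
    return (tp + tn) / (tp + fp + fn + tn)


actual = [1, 1, 1, 0, 0, 0, 0, 1]
predicted = [1, 1, 0, 0, 0, 1, 0, 1]
tp, fp, fn, tn = confusion_counts(actual, predicted)
print(tp, fp, fn, tn)  # 3 1 1 3
```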

  71. None
  72. Which data should we use? It has to be labeled.

  73. PJ Cheng, Text Categorization, 2013 Web IR Slides

  74. PJ Cheng, Text Categorization, 2013 Web IR Slides

  75. PJ Cheng, Text Categorization, 2013 Web IR Slides

  76. iRotateGrasp Cross Validation Result Within-subject cross validation Average: 90.4%

  77. iRotateGrasp Cross Validation Result. Within-subject cross validation, average: 90.4%. Leave-one-subject-out cross validation, average: 80.9%.
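Leave-one-subject-out cross validation holds out all samples of one participant per fold, so the classifier is always tested on a person it has never seen. The split can be sketched like this; the function name and the toy subject IDs are assumptions, and training and scoring are left to whichever classifier is in use.

```python
def leave_one_subject_out(subject_ids):
    """Yield (held_out_subject, train_indices, test_indices) for each subject."""
    for held_out in sorted(set(subject_ids)):
        train = [i for i, s in enumerate(subject_ids) if s != held_out]
        test = [i for i, s in enumerate(subject_ids) if s == held_out]
        yield held_out, train, test


# One subject ID per sample, in the same order as the feature vectors.
subjects = ["a", "a", "b", "c", "c"]
folds = list(leave_one_subject_out(subjects))
```

Within-subject cross validation, by contrast, splits each participant's own samples into training and test parts, so the same person appears on both sides.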
  78. Learning Curve Mayank et al. UIST '12, University of Washington

    GripSense: using built-in sensors to detect hand posture and pressure on commodity mobile phones
  79. Classifying Algorithms

  80. Routine Choose algorithms Tune parameters Compare the results of different

    algos / params
  81. Numerical Function Numeric input & numeric output. Categorical Output -

    Discrete “type” labels Continuous Output - Real values
  82. Artificial Neural Network

  83. MS Cheng, Neuro Net & SVM, 2012 Data Mining Slides

  84. MS Cheng, Neuro Net & SVM, 2012 Data Mining Slides

  85. MS Cheng, Neuro Net & SVM, 2012 Data Mining Slides

  86. Support Vector Machines

  87. PJ Cheng, Text Categorization, 2013 Web IR Slides

  88. PJ Cheng, Text Categorization, 2013 Web IR Slides

  89. None
  90. None
  91. Probability (Graphical) Models

  92. Naive Bayes

  93. PJ Cheng, Text Categorization, 2013 Web IR Slides

  94. =P(buys_comp=y|X) =P(buys_comp=n|X) MS Cheng, Decision Tree & Naive Bayes, 2012

    Data Mining Slides
  95. Hidden Markov Model (HMM)

  96. HMM o1 o2 s1 s2 s1 s2 s1 s2

  97. None
  98. Other Algorithms

  99. Instance Based: K-Nearest Neighbor (KNN)

  100. PJ Cheng, Text Categorization, 2013 Web IR Slides Choose a

    distance metric to calculate the distances between feature vectors
  101. PJ Cheng, Text Categorization, 2013 Web IR Slides
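A minimal K-nearest-neighbor sketch: pick a distance metric, find the k training vectors closest to the query, and take a majority vote. Euclidean distance, k=3, and the toy points are assumptions chosen for illustration.

```python
import math
from collections import Counter


def euclidean(a, b):
    """Straight-line distance between two feature vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def knn_predict(train, query, k=3, distance=euclidean):
    """train: list of (feature_vector, label) pairs. Returns the majority label."""
    neighbors = sorted(train, key=lambda pair: distance(pair[0], query))[:k]
    votes = Counter(label for _, label in neighbors)
    return votes.most_common(1)[0][0]


train = [((0, 0), "a"), ((0, 1), "a"), ((1, 0), "a"),
         ((5, 5), "b"), ((5, 6), "b"), ((6, 5), "b")]
print(knn_predict(train, (0.5, 0.5)))  # a
print(knn_predict(train, (5.5, 5.5)))  # b
```

Any other metric can be passed through the `distance` parameter without changing the rest of the code.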

  102. Trees: J48 (C4.5) Decision Tree

  103. PJ Cheng, Text Categorization, 2013 Web IR Slides

  104. PJ Cheng, Text Categorization, 2013 Web IR Slides

  105. PJ Cheng, Text Categorization, 2013 Web IR Slides

  106. The algorithms we just covered

  107. The model learns whatever the teacher teaches

  108. Supervised Learning

  109. One tree, two branches: supervised, unsupervised (“classifier” / “cluster”). SD Lin, Final Mark on Machine Learning, 2013 PGM Slides
  110. One tree, two branches: supervised, unsupervised (“classifier” / “cluster”). SD Lin, Final Mark on Machine Learning, 2013 PGM Slides. Almost everything just covered is here.
  111. One tree, two branches: supervised, unsupervised (“classifier” / “cluster”). SD Lin, Final Mark on Machine Learning, 2013 PGM Slides. Almost everything just covered is here. HMM is here.
  112. One tree, two branches: supervised, unsupervised (“classifier” / “cluster”). SD Lin, Final Mark on Machine Learning, 2013 PGM Slides. Almost everything just covered is here. HMM is here. Taught in the AI course.
  113. One tree, two branches: supervised, unsupervised (“classifier” / “cluster”). SD Lin, Final Mark on Machine Learning, 2013 PGM Slides. Almost everything just covered is here. HMM is here. Taught in the AI course. Clustering.
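  114. Skill Tree. Starter class, required (?): 鄭卜壬, Web Information Retrieval and Mining, second semester. Easier going once the required course is done: 陳銘憲 (@EE), Data Mining, first semester; 陳信希, Natural Language Processing, first semester; 李琳山, Introduction to Digital Speech Processing, second semester. Courses that use ML but are only loosely related: 于天立 (EE) / 許永真, Artificial Intelligence, first semester; 徐宏民, Multimedia Analysis and Indexing, first semester. The big bosses to defeat on the way to the truth: 林軒田, Machine Learning, first semester; 林守德, Probabilistic Graphical Models, first semester. For the fearless who want to enter competitions: 林智仁, Machine Learning: Theory and Practice, second semester.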
  115. Weka

  116. Weka. The University of Waikato. The WEKA Data Mining Software: An Update, 2009. In Java. Can be put in Android. GUI! Multiple algorithms implemented. Unified input / output format.
  117. Weka Explorer Try algorithms Create model files

  118. Weka Explorer - Preprocess Open file Switch to Classify

  119. Weka Explorer - Classify Specify label Choose classifier Set parameter

    Set test options Start
  120. LIBSVM

  121. LIBSVM. Prof. 林智仁 (Chih-Jen Lin). LIBSVM: A library for support vector machines, 2011. In multiple languages. Can be put in Android & iOS! Simple install (just make!). Simple input / output format. Tutorial: http://www.csie.ntu.edu.tw/~piaip/docs/svm/
  122. LIBSVM Input Format http://www.csie.ntu.edu.tw/~piaip/docs/svm/#
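LIBSVM reads one sample per line as `<label> <index>:<value> ...`, with feature indices starting at 1 in ascending order; zero-valued features may be left out. A small sketch of a writer for that format follows; the helper name `to_libsvm_line` is hypothetical and not part of LIBSVM.

```python
def to_libsvm_line(label, features):
    """Format one sample as '<label> 1:<v1> 2:<v2> ...', skipping zeros."""
    pairs = [
        f"{index}:{value:g}"
        for index, value in enumerate(features, start=1)
        if value != 0
    ]
    return " ".join([str(label)] + pairs)


line = to_libsvm_line(1, [0.5, 0, 3])
print(line)  # 1 1:0.5 3:3
```

Writing one such line per training sample produces a file that can be passed to the training binary.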

  123. LIBSVM Output Format. Model file. Prediction file: one label per line. Probability estimates are attached if the -b flag is set to 1.
  124. LIBSVM Binaries: svmtrain, svmpredict. Read http://www.csie.ntu.edu.tw/~piaip/docs/svm/#

  125. 217 Workstations. Hardware: http://wslab.csie.ntu.edu.tw/hardware/ Home directories are on NFS; /tmp2 is on each host's local disk. System status: http://mrtg.csie.ntu.edu.tw/
  126. 217 Train Stations. Usually running grid.py, finding the optimal c (cost) and g (gamma). SSH authorized_keys setup (search for “passwordless SSH login”). ssh_workers & nr_local_worker. Read tools/README in the svm folder.
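The idea behind the grid search for c and g can be sketched as an exhaustive loop over an exponentially spaced grid, keeping the pair with the best score. This is not grid.py itself: `exp2_range`, `grid_search`, and `toy_score` are hypothetical names, the exponent ranges are assumed to match the defaults documented for grid.py (check tools/README), and the real tool scores each pair by cross-validation accuracy rather than a toy function.

```python
import math


def exp2_range(begin, end, step):
    """Return [2**begin, 2**(begin+step), ...] up to and including 2**end."""
    values = []
    exponent = begin
    while (step > 0 and exponent <= end) or (step < 0 and exponent >= end):
        values.append(2.0 ** exponent)
        exponent += step
    return values


def grid_search(score, c_values, g_values):
    """Try every (c, g) pair and return (best_score, best_c, best_g)."""
    best = None
    for c in c_values:
        for g in g_values:
            s = score(c, g)
            if best is None or s > best[0]:
                best = (s, c, g)
    return best


def toy_score(c, g):
    """Stand-in for cross-validation accuracy; peaks at c=8, g=0.5."""
    return -(abs(math.log2(c) - 3) + abs(math.log2(g) + 1))


c_grid = exp2_range(-5, 15, 2)   # assumed default range for log2(c)
g_grid = exp2_range(3, -15, -2)  # assumed default range for log2(g)
best_score, best_c, best_g = grid_search(toy_score, c_grid, g_grid)
```

Because every (c, g) pair is evaluated independently, the pairs can be spread across several machines, which is what the worker settings on the slide are for.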