Slide 1

Slide 1 text

Machine Learning for HCI Johnson @ NTU Mobile HCI Lab

Slide 2

Slide 2 text

ML in Projects: iRotateGrasp by Xman 龍哥; SenSleep by Jimmyken; RingTune (final project for the Data Mining course)

Slide 3

Slide 3 text

iRotateGrasp

Slide 4

Slide 4 text

iRotateGrasp

Slide 5

Slide 5 text

No content

Slide 6

Slide 6 text

SenSleep

Slide 7

Slide 7 text

SenSleep Basic Flow

Slide 8

Slide 8 text

Ringtune

Slide 9

Slide 9 text

Ringtune

Slide 10

Slide 10 text

Ringtune: incoming call, ringer volume adjustment, motion (shake), sound, proximity sensing, light sensing, sqlite, wifi, train data

Slide 11

Slide 11 text

What’s Common in Those Projects? They collect input data. iRotateGrasp: 44 capacitive sensor values. SenSleep: mobile & PC activities. Ringtune: ambient sound, accelerations, light.

Slide 12

Slide 12 text

What’s Common in Those Projects? (2) The data is used to determine an output. iRotateGrasp: screen orientation. SenSleep: if the user is sleeping in a time slot. Ringtune: the desired ringer volume (0~7)

Slide 14

Slide 14 text

The Core of Decision Making: Classifier. Given: (input, output) pairs. Goal: given any input, predict the output.

Slide 15

Slide 15 text

Training & Testing a Classifier. Training: learns from data. Testing: ask for output.
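The two phases can be sketched with a deliberately simple classifier. This is a minimal nearest-centroid sketch, not the method used in any of the projects above; the function names and the toy data are hypothetical.

```python
from collections import defaultdict
from math import dist


def train(samples):
    """Training: learn one centroid per label from (features, label) pairs."""
    grouped = defaultdict(list)
    for features, label in samples:
        grouped[label].append(features)
    return {
        label: [sum(column) / len(rows) for column in zip(*rows)]
        for label, rows in grouped.items()
    }


def predict(model, features):
    """Testing: return the label whose centroid is closest to the input."""
    return min(model, key=lambda label: dist(model[label], features))


# Hypothetical toy data: two sensor readings per sample.
training_pairs = [
    ([0.1, 0.2], "portrait"),
    ([0.0, 0.3], "portrait"),
    ([0.9, 0.8], "landscape"),
    ([1.0, 0.7], "landscape"),
]
model = train(training_pairs)
prediction = predict(model, [0.8, 0.9])
```

Any real classifier (SVM, decision tree, and so on) exposes the same two steps: fit on labeled pairs, then query with unseen inputs.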

Slide 16

Slide 16 text

How & What to Train?

Slide 17

Slide 17 text

How & What to Train?

Slide 18

Slide 18 text

How & What to Train? Which data should we collect? How many samples?

Slide 19

Slide 19 text

How & What to Train? Which data should we collect? How many samples? Where does the "correct answer" (ground truth) come from?

Slide 20

Slide 20 text

iRotateGrasp's Approach

Slide 21

Slide 21 text

iRotateGrasp Prototype

Slide 22

Slide 22 text

44 sensors

Slide 23

Slide 23 text

44 sensors: < s1, s2, ..., s44 > & output (44-value input + output)

Slide 24

Slide 24 text

44 sensors: < s1, s2, ..., s44 > & output (44-value input + output), fed to a LIBSVM classifier. Chih-Chung Chang and Chih-Jen Lin, LIBSVM: a library for support vector machines. ACM Transactions on Intelligent Systems and Technology, 2:27:1--27:27, 2011

Slide 26

Slide 26 text

LIBSVM Data Collection Session 

Slide 27

Slide 27 text

LIBSVM Data Collection Session 

Slide 28

Slide 28 text

LIBSVM Data Collection Session Stand Sit Lie down Lie down (side) ❌ 

Slide 29

Slide 29 text

LIBSVM Data Collection Session Stand Sit Lie down Lie down (side) Left hand Right hand Both hands ❌ ❌ 

Slide 30

Slide 30 text

LIBSVM Data Collection Session Stand Sit Lie down Lie down (side) Left hand Right hand Both hands ❌ ❌ ❌ 

Slide 31

Slide 31 text

LIBSVM Data Collection Session Stand Sit Lie down Lie down (side) Left hand Right hand Both hands ❌ ❌ ❌  162,000 samples

Slide 32

Slide 32 text

SenSleep's Approach

Slide 33

Slide 33 text

SenSleep Data Collection. Mobile: screen lock/unlock events; accelerometer values; battery charging events; light sensor values; current location change events; system-defined broadcast events. Desktop PC: idle intervals. 12 participants, 7 days.

Slide 34

Slide 34 text

SenSleep Ground Truth IP Camera @ participants’ bedrooms...... Participants see the pictures taken & report actual sleeping time.

Slide 35

Slide 35 text

That’s A Lot to Learn From!

Slide 36

Slide 36 text

That’s A Lot to Learn From! Too much raw data Requires lots of training data to conclude!

Slide 37

Slide 37 text

That’s A Lot to Learn From! Too much raw data Requires lots of training data to conclude! What really matters? “Features” Features should describe our data better.

Slide 38

Slide 38 text

SenSleep Features: (1) screen on/off (0 or 1); (2) elapsed time since screen on/off; (3) battery charging on/off; (4) elapsed time since last battery event; (5) current coordinate (location); (6) offset in location, compared to 15 min before; (7) accelerometer average values; (8) accelerometer median values; (9) elapsed time since last PC keyboard / mouse activity.

Slide 39

Slide 39 text

SenSleep Features: (1) screen on/off (0 or 1); (2) elapsed time since screen on/off; (3) battery charging on/off; (4) elapsed time since last battery event; (5) current coordinate (location); (6) offset in location, compared to 15 min before; (7) accelerometer average values; (8) accelerometer median values; (9) elapsed time since last PC keyboard / mouse activity. Together: a 9-dimensional feature vector.

Slide 40

Slide 40 text

Training SenSleep Classifier 9-dimensional feature vector + is_sleeping label Input Output

Slide 41

Slide 41 text

History-Related Feature < f1, ..., f9 >

Slide 42

Slide 42 text

History-Related Feature < f1, ..., f9 > < last_is_sleeping, f1, ..., f9 >

Slide 43

Slide 43 text

History-Related Feature < f1, ..., f9 > < last_is_sleeping, f1, ..., f9 > < last_f1, ..., last_f9, f1, ..., f9 >
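The two history variants above can be sketched as follows. This is a minimal sketch under assumptions: the helper name `add_history`, the list-of-lists layout, and the toy values are hypothetical, and how SenSleep actually assembles its vectors is not shown in the slides.

```python
def add_history(feature_rows, labels=None):
    """Prefix each time slot's features with information from the previous slot.

    With labels, the previous slot's is_sleeping label is prepended,
    giving <last_is_sleeping, f1, ..., f9>. Without labels, the previous
    slot's whole feature vector is prepended, giving
    <last_f1, ..., last_f9, f1, ..., f9>.
    The first slot has no history, so it is skipped.
    """
    rows = []
    for i in range(1, len(feature_rows)):
        if labels is not None:
            history = [labels[i - 1]]
        else:
            history = list(feature_rows[i - 1])
        rows.append(history + list(feature_rows[i]))
    return rows


# Hypothetical toy slots with 2 features each (instead of 9) for brevity.
slots = [[1, 0], [0, 5], [0, 10]]
is_sleeping = [0, 0, 1]

with_last_label = add_history(slots, is_sleeping)
with_last_features = add_history(slots)
```

When the previous label is used as a feature, at prediction time it would presumably have to come from the classifier's own previous output, since the true label is not available.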

Slide 44

Slide 44 text

RingTune's Approach

Slide 45

Slide 45 text

Ringer Volume Adj. (Implementation): incoming call, ringer volume adjustment, motion (shake), sound, proximity sensing, light sensing, sqlite, wifi, train data

Slide 46

Slide 46 text

Incoming Call (Implementation): incoming call, ringer volume adjustment, motion (shake), sound, proximity sensing, light sensing, sqlite, wifi, train data

Slide 47

Slide 47 text

Collection Pending (Implementation): incoming call, ringer volume adjustment, motion (shake), sound, proximity sensing, light sensing, sqlite, wifi, train data

Slide 48

Slide 48 text

Turn On Sensors (Implementation): incoming call, ringer volume adjustment, motion (shake), sound, proximity sensing, light sensing, sqlite, wifi, train data

Slide 49

Slide 49 text

Classifier (Implementation): < avg_x, avg_y, avg_z, var_x, var_y, var_z, avg_dx, avg_dy, avg_dz, light, close >, an 11D feature vector & Volume
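One way such an 11D vector could be computed from a window of accelerometer samples is sketched below. The slide does not define the individual features, so this is an assumption-laden illustration: `ringtune_features` is a hypothetical name, and `avg_dx`, `avg_dy`, `avg_dz` are assumed to be the mean absolute change between consecutive samples.

```python
from statistics import fmean, pvariance


def ringtune_features(accel, light, close):
    """Build <avg_x..z, var_x..z, avg_dx..dz, light, close> from one window.

    accel is a list of (x, y, z) accelerometer samples (at least two).
    """
    axes = list(zip(*accel))  # three per-axis sequences
    averages = [fmean(axis) for axis in axes]
    variances = [pvariance(axis) for axis in axes]
    # Assumption: avg_d* is the mean absolute change between consecutive samples.
    deltas = [
        fmean(abs(b - a) for a, b in zip(axis, axis[1:]))
        for axis in axes
    ]
    return averages + variances + deltas + [light, close]


# Hypothetical window: the phone moves along x only.
window = [(0, 0, 1), (2, 0, 1), (4, 0, 1)]
vector = ringtune_features(window, light=120.0, close=0)
```

The resulting vector, paired with the ringer volume the user chose, would form one training sample.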

Slide 50

Slide 50 text

Feature Extraction in Other Work

Slide 51

Slide 51 text

Phone Localization. Azizyan et al., MobiCom '09, Duke University. SurroundSense: mobile phone localization via ambience fingerprinting

Slide 52

Slide 52 text

Phone Localization. Azizyan et al., MobiCom '09, Duke University. SurroundSense: mobile phone localization via ambience fingerprinting. Sound

Slide 53

Slide 53 text

Phone Localization. Azizyan et al., MobiCom '09, Duke University. SurroundSense: mobile phone localization via ambience fingerprinting. Sound, Color of Light

Slide 54

Slide 54 text

Phone Localization. Azizyan et al., MobiCom '09, Duke University. SurroundSense: mobile phone localization via ambience fingerprinting. Sound, Color of Light, Motion

Slide 55

Slide 55 text

Sound Feature. Waveform; Waveform (zoomed to samples). Azizyan et al., MobiCom '09, Duke University. SurroundSense: mobile phone localization via ambience fingerprinting

Slide 56

Slide 56 text

Color Feature. Azizyan et al., MobiCom '09, Duke University. SurroundSense: mobile phone localization via ambience fingerprinting

Slide 57

Slide 57 text

Motion Feature: Moving vs. Static. Feature: moving average & variance of instantaneous acceleration. Azizyan et al., MobiCom '09, Duke University. SurroundSense: mobile phone localization via ambience fingerprinting

Slide 58

Slide 58 text

Sensing Grip Pattern. Determine: On Table / In Hand; Thumb / Index Finger; Left / Right Thumb; Pressure. Goel et al., UIST '12, University of Washington. GripSense: using built-in sensors to detect hand posture and pressure on commodity mobile phones

Slide 59

Slide 59 text

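Actual Application in Use. Hailpern et al., DIS '10, University of Illinois at Urbana Champaign. The CLOTHO Project: Predicting Application Utility. Input, the current system snapshot: application launch / exit, window switching, login / logout, boot / shutdown, per-application CPU usage, per-application RAM usage, window z-buffer, desktop size, window size, window coordinates, window visible area, focused app, mouse position, timestamp. Output: the "important application" actually in use (High-utilization Application).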

Slide 60

Slide 60 text

Document Classification Bag of words John likes to watch movies. Mary likes too. John also likes to watch football games.

Slide 61

Slide 61 text

Document Classification: Bag of words. "John likes to watch movies. Mary likes too." "John also likes to watch football games." Dictionary: John, likes, to, watch, movies, also, football, games, Mary, too

Slide 62

Slide 62 text

Document Classification: Bag of words. "John likes to watch movies. Mary likes too." "John also likes to watch football games." Dictionary: John, likes, to, watch, movies, also, football, games, Mary, too. <1, 2, 1, 1, 1, 0, 0, 0, 1, 1> <1, 1, 1, 1, 0, 1, 1, 1, 0, 0> (Multinomial, counts occurrences)

Slide 63

Slide 63 text

Document Classification: Bag of words. "John likes to watch movies. Mary likes too." "John also likes to watch football games." Dictionary: John, likes, to, watch, movies, also, football, games, Mary, too. <1, 1, 1, 1, 1, 0, 0, 0, 1, 1> <1, 1, 1, 1, 0, 1, 1, 1, 0, 0> (Bernoulli, present or not)
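Both bag-of-words variants can be reproduced with a few lines of code. This is a minimal sketch: the dictionary and documents come from the slides, while the function names and the simple regex tokenizer are assumptions made for illustration.

```python
import re
from collections import Counter

DICTIONARY = ["John", "likes", "to", "watch", "movies",
              "also", "football", "games", "Mary", "too"]


def tokenize(text):
    """Split text into words, dropping punctuation."""
    return re.findall(r"[A-Za-z]+", text)


def multinomial_vector(text):
    """Count how many times each dictionary word occurs."""
    counts = Counter(tokenize(text))
    return [counts[word] for word in DICTIONARY]


def bernoulli_vector(text):
    """Mark each dictionary word as present (1) or absent (0)."""
    present = set(tokenize(text))
    return [int(word in present) for word in DICTIONARY]


doc1 = "John likes to watch movies. Mary likes too."
doc2 = "John also likes to watch football games."
```

The only difference between the two vectors for `doc1` is the entry for "likes", which occurs twice: 2 in the multinomial vector, 1 in the Bernoulli one.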

Slide 64

Slide 64 text

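"Words" in Chinese: word segmentation is needed to obtain the feature vector. Options: dictionary-based; the Academia Sinica (中研院) word segmentation system; Stanford Word Segmenter; MMSeg; n-gram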

Slide 65

Slide 65 text

Evaluating a Classifier

Slide 66

Slide 66 text

Confusion Matrix (Binary)

Slide 67

Slide 67 text

Confusion Matrix

Slide 68

Slide 68 text

Precision, Recall PJ Cheng, Text Categorization, 2013 Web IR Class

Slide 69

Slide 69 text

No content

Slide 70

Slide 70 text

Accuracy
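The standard definitions of these metrics for the binary case can be written out as code. This is a minimal sketch with hypothetical function names and toy labels; precision is TP / (TP + FP), recall is TP / (TP + FN), and accuracy is (TP + TN) / total.

```python
def binary_confusion(actual, predicted, positive=1):
    """Return (tp, fp, fn, tn) counts for a binary task."""
    tp = fp = fn = tn = 0
    for a, p in zip(actual, predicted):
        if p == positive and a == positive:
            tp += 1
        elif p == positive:
            fp += 1
        elif a == positive:
            fn += 1
        else:
            tn += 1
    return tp, fp, fn, tn


def precision(tp, fp):
    """Of everything predicted positive, the fraction that is truly positive."""
    return tp / (tp + fp) if tp + fp else 0.0


def recall(tp, fn):
    """Of everything truly positive, the fraction that was found."""
    return tp / (tp + fn) if tp + fn else 0.0


def accuracy(tp, fp, fn, tn):
    """Fraction of all predictions that are correct."""
    return (tp + tn) / (tp + fp + fn + tn)


# Hypothetical toy labels (1 = positive class).
actual = [1, 1, 1, 0, 0, 0, 0, 1]
predicted = [1, 0, 1, 0, 1, 0, 0, 1]
tp, fp, fn, tn = binary_confusion(actual, predicted)
```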

Slide 71

Slide 71 text

No content

Slide 72

Slide 72 text

Which data should we use? It has to be labeled.

Slide 73

Slide 73 text

PJ Cheng, Text Categorization, 2013 Web IR Slides

Slide 74

Slide 74 text

PJ Cheng, Text Categorization, 2013 Web IR Slides

Slide 75

Slide 75 text

PJ Cheng, Text Categorization, 2013 Web IR Slides

Slide 76

Slide 76 text

iRotateGrasp Cross Validation Result Within-subject cross validation Average: 90.4%

Slide 77

Slide 77 text

iRotateGrasp Cross Validation Result. Within-subject cross validation: average 90.4%. Leave-one-subject-out cross validation: average 80.9%.
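Leave-one-subject-out cross validation holds out all samples of one participant per fold, so the classifier is always tested on a user it has never seen. A minimal sketch of the split is below; the function name, the tuple layout, and the toy data are hypothetical and not taken from the iRotateGrasp code.

```python
def leave_one_subject_out(samples):
    """Yield (held_out_subject, train, test) once per subject.

    samples is a list of (subject_id, features, label) tuples.
    """
    subjects = sorted({subject for subject, _, _ in samples})
    for held_out in subjects:
        train = [(x, y) for s, x, y in samples if s != held_out]
        test = [(x, y) for s, x, y in samples if s == held_out]
        yield held_out, train, test


# Hypothetical toy data: three subjects, two samples each.
samples = [
    ("A", [0.1], 0), ("A", [0.9], 1),
    ("B", [0.2], 0), ("B", [0.8], 1),
    ("C", [0.3], 0), ("C", [0.7], 1),
]
folds = list(leave_one_subject_out(samples))
```

Within-subject cross validation, by contrast, splits each participant's own data into training and test parts, which plausibly explains why its accuracy is higher.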

Slide 78

Slide 78 text

Learning Curve. Goel et al., UIST '12, University of Washington. GripSense: using built-in sensors to detect hand posture and pressure on commodity mobile phones

Slide 79

Slide 79 text

Classifying Algorithms

Slide 80

Slide 80 text

Routine: (1) choose algorithms; (2) tune parameters; (3) compare the results of different algorithms / parameters.

Slide 81

Slide 81 text

Numerical Function Numeric input & numeric output. Categorical Output - Discrete “type” labels Continuous Output - Real values

Slide 82

Slide 82 text

Artificial Neural Network

Slide 83

Slide 83 text

MS Cheng, Neuro Net & SVM, 2012 Data Mining Slides

Slide 84

Slide 84 text

MS Cheng, Neuro Net & SVM, 2012 Data Mining Slides

Slide 85

Slide 85 text

MS Cheng, Neuro Net & SVM, 2012 Data Mining Slides

Slide 86

Slide 86 text

Support Vector Machines

Slide 87

Slide 87 text

PJ Cheng, Text Categorization, 2013 Web IR Slides

Slide 88

Slide 88 text

PJ Cheng, Text Categorization, 2013 Web IR Slides

Slide 89

Slide 89 text

No content

Slide 90

Slide 90 text

No content

Slide 91

Slide 91 text

Probability (Graphical) Models

Slide 92

Slide 92 text

Naive Bayes

Slide 93

Slide 93 text

PJ Cheng, Text Categorization, 2013 Web IR Slides

Slide 94

Slide 94 text

=P(buys_comp=y|X) =P(buys_comp=n|X) MS Cheng, Decision Tree & Naive Bayes, 2012 Data Mining Slides

Slide 95

Slide 95 text

Hidden Markov Model (HMM)

Slide 96

Slide 96 text

HMM [diagram: observations o1, o2 and states s1, s2 at each time step]

Slide 97

Slide 97 text

No content

Slide 98

Slide 98 text

Other Algorithms

Slide 99

Slide 99 text

Instance Based: K-Nearest Neighbor (KNN)

Slide 100

Slide 100 text

Choose a distance metric to calculate the distances between feature vectors. PJ Cheng, Text Categorization, 2013 Web IR Slides
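A k-nearest-neighbor classifier with a pluggable distance metric can be sketched in a few lines. This is an illustrative sketch, not code from the cited slides; the function names and toy points are hypothetical, and Euclidean distance is only the default choice here.

```python
from collections import Counter
from math import dist


def manhattan(u, v):
    """Alternative metric: sum of absolute coordinate differences."""
    return sum(abs(a - b) for a, b in zip(u, v))


def knn_predict(training, query, k=3, metric=dist):
    """Return the majority label among the k training points nearest to query.

    training is a list of (features, label) pairs.
    """
    neighbors = sorted(training, key=lambda pair: metric(pair[0], query))[:k]
    votes = Counter(label for _, label in neighbors)
    return votes.most_common(1)[0][0]


# Hypothetical toy data: two well-separated groups.
training = [
    ([0, 0], "a"), ([0, 1], "a"), ([1, 0], "a"),
    ([5, 5], "b"), ([5, 6], "b"), ([6, 5], "b"),
]
near_a = knn_predict(training, [0.5, 0.5])
near_b = knn_predict(training, [5.5, 5.5], metric=manhattan)
```

There is no training step beyond storing the samples, which is why KNN is called instance based.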

Slide 101

Slide 101 text

PJ Cheng, Text Categorization, 2013 Web IR Slides

Slide 102

Slide 102 text

Trees: J48 (C4.5) Decision Tree

Slide 103

Slide 103 text

PJ Cheng, Text Categorization, 2013 Web IR Slides

Slide 104

Slide 104 text

PJ Cheng, Text Categorization, 2013 Web IR Slides

Slide 105

Slide 105 text

PJ Cheng, Text Categorization, 2013 Web IR Slides

Slide 106

Slide 106 text

The algorithms we just covered

Slide 107

Slide 107 text

The model learns whatever the teacher teaches

Slide 108

Slide 108 text

Supervised Learning

Slide 109

Slide 109 text

One tree, two branches: supervised and unsupervised ("classifier" / "cluster"). SD Lin, Final Mark on Machine Learning, 2013 PGM Slides

Slide 110

Slide 110 text

One tree, two branches: supervised and unsupervised ("classifier" / "cluster"). SD Lin, Final Mark on Machine Learning, 2013 PGM Slides. Almost everything just taught is here.

Slide 111

Slide 111 text

One tree, two branches: supervised and unsupervised ("classifier" / "cluster"). SD Lin, Final Mark on Machine Learning, 2013 PGM Slides. Almost everything just taught is here. HMM is here.

Slide 112

Slide 112 text

One tree, two branches: supervised and unsupervised ("classifier" / "cluster"). SD Lin, Final Mark on Machine Learning, 2013 PGM Slides. Almost everything just taught is here. HMM is here. Taught in the AI course.

Slide 113

Slide 113 text

One tree, two branches: supervised and unsupervised ("classifier" / "cluster"). SD Lin, Final Mark on Machine Learning, 2013 PGM Slides. Almost everything just taught is here. HMM is here. Taught in the AI course. Clustering.

Slide 114

Slide 114 text

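Skill Tree. Starter class, required (?): 鄭卜壬, Web Information Retrieval and Mining (second semester). Relatively easy after the required course: 陳銘憲 (@EE), Data Mining (first semester); 陳信希, Natural Language Processing (first semester); 李琳山, Introduction to Digital Speech Processing (second semester). Courses that use ML but are only loosely related: 于天立 (EE) / 許永真, Artificial Intelligence (first semester); 徐宏民, Multimedia Analysis and Indexing (first semester). The final bosses to defeat on the way to the truth: 林軒田, Machine Learning (first semester); 林守德, Probabilistic Graphical Models (first semester). "I'm really good and want to compete": 林智仁, Machine Learning: Theory and Practice (second semester).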

Slide 115

Slide 115 text

Weka

Slide 116

Slide 116 text

Weka. The University of Waikato. The WEKA Data Mining Software: An Update, 2009. Written in Java; can be put in Android. GUI! Multiple algorithms implemented. Unified input / output format.

Slide 117

Slide 117 text

Weka Explorer: try algorithms; create model files.

Slide 118

Slide 118 text

Weka Explorer - Preprocess: open file; switch to Classify.

Slide 119

Slide 119 text

Weka Explorer - Classify: specify label; choose classifier; set parameters; set test options; start.

Slide 120

Slide 120 text

LIBSVM

Slide 121

Slide 121 text

LIBSVM. Prof. Chih-Jen Lin (林智仁). LIBSVM: A library for support vector machines, 2011. Available in multiple languages. Can be put in Android & iOS! Simple install (just make!). Simple input / output format. Tutorial: http://www.csie.ntu.edu.tw/~piaip/docs/svm/

Slide 122

Slide 122 text

LIBSVM Input Format http://www.csie.ntu.edu.tw/~piaip/docs/svm/#
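In the LIBSVM input format each line holds one sample as `<label> <index>:<value> <index>:<value> ...`, with feature indices starting at 1 and in ascending order; zero-valued features may be left out. The helper below is a minimal sketch for producing such lines; `to_libsvm_line` is a hypothetical name, not part of LIBSVM.

```python
def to_libsvm_line(label, features):
    """Format one sample as '<label> <index>:<value> ...'.

    Indices start at 1; zero-valued features are omitted (sparse format).
    """
    pairs = [
        f"{index}:{value:g}"
        for index, value in enumerate(features, start=1)
        if value != 0
    ]
    return " ".join([str(label)] + pairs)


# Hypothetical samples: a label followed by three feature values.
line_a = to_libsvm_line(1, [0.5, 0, 3])
line_b = to_libsvm_line(-1, [0, 0, 2.25])
```

Writing one such line per training sample to a text file gives a file that the LIBSVM training tool can read.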

Slide 123

Slide 123 text

LIBSVM Output Format. Model file. Prediction file: one label per line; confidence (probability estimates) attached if the -b 1 flag is set.

Slide 124

Slide 124 text

LIBSVM Binaries: svm-train, svm-predict. Read http://www.csie.ntu.edu.tw/~piaip/docs/svm/#

Slide 125

Slide 125 text

217 Workstations. Hardware: http://wslab.csie.ntu.edu.tw/hardware/ Home directories are on NFS; /tmp2 is on each host's local disk. System status: http://mrtg.csie.ntu.edu.tw/

Slide 126

Slide 126 text

217 Train Stations. Usually running grid.py: finding the optimal c (cost) and g (gamma). SSH authorized_keys setup (search for "SSH 免密碼", i.e. passwordless SSH). ssh_workers & nr_local_worker. Read tools/README in the svm folder.
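The idea behind the parameter search can be sketched as an exhaustive loop over exponentially spaced values of c and g. This is not grid.py itself: `grid_search`, the exponent ranges, and the toy scoring function are assumptions for illustration. In practice the score for each (c, g) pair would be a cross-validation accuracy, and the evaluations can be distributed across machines because they are independent.

```python
from itertools import product
from math import log2


def grid_search(score, log2c_values=range(-5, 16, 2), log2g_values=range(3, -16, -2)):
    """Try every (c, g) pair on an exponential grid and keep the best score.

    score(c, g) is any callable returning a number where larger is better.
    Returns (best_score, best_c, best_g).
    """
    best = None
    for log2c, log2g in product(log2c_values, log2g_values):
        c, g = 2.0 ** log2c, 2.0 ** log2g
        value = score(c, g)
        if best is None or value > best[0]:
            best = (value, c, g)
    return best


def toy_score(c, g):
    """Hypothetical stand-in for cross-validation accuracy, peaking at c=2^3, g=2^-5."""
    return -((log2(c) - 3) ** 2 + (log2(g) + 5) ** 2)


best_score, best_c, best_g = grid_search(toy_score)
```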