Slide 1

Slide 1 text

How Deep is Deep Learning?
Amar Lalwani
Lead Engineer, R & D, funtoot
Ph.D. Candidate, IIIT-Bangalore

Slide 2

Slide 2 text

In a Classroom ..

Slide 3

Slide 3 text

Homogeneous Teaching

Slide 4

Slide 4 text

Two Sigma Problem

Slide 5

Slide 5 text

Two sigma problem

Slide 6

Slide 6 text

Funtoot: Intelligent Tutoring System
• Every child is unique
• Personalised (One-on-One) Tutoring
• Mastery Learning

Slide 7

Slide 7 text

Funtoot: Journey so far ..

Slide 8

Slide 8 text

Student Data
1. Q1 => solved
2. Q2 => unsolved
3. Q3 => solved
4. Q1 => unsolved
5. Q3 => solved
6. Q4 => unsolved
7. Q1 => solved
8. Q2 => unsolved

Slide 9

Slide 9 text

Student Data (Contd.)
• How much does the student know?
• Ask the student!

Slide 10

Slide 10 text

Knowledge Tracing (KT)
• For some skill K
• Given a student's response sequence 1 to n, predict response n+1
Chronological response sequence for student Y (0 = incorrect response, 1 = correct response):
Response: 0 0 0 1 1 1 ?
Position: 1 ... n n+1
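The prediction task on this slide can be sketched as turning one response sequence into (history, next response) pairs. This is a minimal illustration only; the function name `next_response_pairs` is hypothetical and not taken from the talk.

```python
def next_response_pairs(responses):
    """Split a 0/1 response sequence into (history, next_response) pairs.

    Each pair asks: given responses 1..i, what is response i+1?
    """
    return [(responses[:i], responses[i]) for i in range(1, len(responses))]


# The observed part of the sequence shown on the slide (0 = incorrect, 1 = correct).
observed = [0, 0, 0, 1, 1, 1]

for history, target in next_response_pairs(observed):
    print(history, "->", target)
```

A knowledge tracing model is fitted on pairs like these and then asked for the response that follows the full observed sequence.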

Slide 11

Slide 11 text

How do we approach this?
• Modelling the learner's knowledge acquisition process
• Fairly complex
• Need a model that is:
  - General
  - Flexible
  - Powerful

Slide 12

Slide 12 text

Play the best card: Deep Learning

Slide 13

Slide 13 text

Deep Knowledge Tracing (DKT)
• Recurrent Neural Networks (RNNs)

Slide 14

Slide 14 text

Deep Knowledge Tracing (DKT)
• RNN or LSTM model
[Figure: the model reads the question sequence Q1, Q2, Q3, ... and outputs a vector (pCorrect(Skill A), pCorrect(Skill B), pCorrect(Skill C)) at each step. Values shown: (0.9, 0.3, 0.2), (0.8, 0.2, 0.1), (0.8, 0.5, 0.3), (1.0, 0.3, 0.7).]
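The shape of the computation in the figure can be sketched with a tiny, untrained vanilla RNN in plain Python. Everything here is an assumption made for illustration: the one-hot (skill, correctness) input encoding follows common DKT practice rather than anything stated on the slide, the weights are random, biases are omitted, and the names `encode_interaction` and `rnn_predict` are hypothetical. The printed numbers are therefore meaningless; only the input and output shapes match the slide.

```python
import math
import random


def encode_interaction(skill_index, correct, num_skills):
    """One-hot encode a (skill, correctness) pair as a vector of length 2 * num_skills."""
    x = [0.0] * (2 * num_skills)
    # First half marks incorrect attempts, second half marks correct attempts.
    x[skill_index + (num_skills if correct else 0)] = 1.0
    return x


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def random_matrix(rows, cols, rng):
    return [[rng.uniform(-0.5, 0.5) for _ in range(cols)] for _ in range(rows)]


def matvec(matrix, vector):
    return [sum(w * v for w, v in zip(row, vector)) for row in matrix]


def rnn_predict(interactions, num_skills, hidden_size=4, seed=0):
    """Run an untrained vanilla RNN and return one pCorrect vector per step."""
    rng = random.Random(seed)
    w_in = random_matrix(hidden_size, 2 * num_skills, rng)
    w_rec = random_matrix(hidden_size, hidden_size, rng)
    w_out = random_matrix(num_skills, hidden_size, rng)
    hidden = [0.0] * hidden_size
    outputs = []
    for skill_index, correct in interactions:
        x = encode_interaction(skill_index, correct, num_skills)
        pre = [a + b for a, b in zip(matvec(w_in, x), matvec(w_rec, hidden))]
        hidden = [math.tanh(p) for p in pre]
        # One probability per skill, as in the slide's output vectors.
        outputs.append([sigmoid(o) for o in matvec(w_out, hidden)])
    return outputs


# Skills A, B, C as indices 0, 1, 2: A answered correctly, B incorrectly, C correctly.
predictions = rnn_predict([(0, 1), (1, 0), (2, 1)], num_skills=3)
for step in predictions:
    print([round(p, 2) for p in step])
```

In a trained model the weights would be learned from student attempts, and an LSTM cell would typically replace the plain tanh recurrence.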

Slide 15

Slide 15 text

Inter-skill Relationships

Slide 16

Slide 16 text

Dataset
• 6th Grade Math, CBSE Curriculum
• 22 topics, 69 sub-topics, 119 sub-sub-topics
• 442 skills (LGs), 1523 problems
• 7780 students, 176 schools
• 2.4 million problem attempts
• 5.6 million data-points
• 76% avoidances (positive class: 1)

Slide 17

Slide 17 text

Results: Accuracy
[Bar chart, accuracy axis from 0 to 0.8. Deep: 0.75]

Slide 18

Slide 18 text

Results: Accuracy
[Bar chart, accuracy axis from 0.6 to 0.76. Deep: 0.75, Shallow: 0.65]

Slide 19

Slide 19 text

Results: Accuracy

Slide 20

Slide 20 text

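Bayesian Knowledge Tracing (BKT)
[Figure: two-state diagram. Hidden states: Learned (knows) and Unlearned (does not know). Observations: Correct and Incorrect. Initial state: Learned with P(L0), Unlearned with 1-P(L0). Transition from Unlearned to Learned: P(T). From Learned: Correct with 1-P(S), Incorrect with P(S). From Unlearned: Correct with P(G), Incorrect with 1-P(G).]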

Slide 21

Slide 21 text

BKT: Parameters
• BKT: 2-state Hidden Markov Model (HMM)
• P(L0): Probability of Initial Knowledge
• P(T): Probability of Learning
• P(S): Probability of Slip
• P(G): Probability of Guess
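The four parameters above combine through the standard BKT equations: predict the chance of a correct response from the current knowledge estimate, condition that estimate on the observed response, then apply the learning transition. The sketch below uses made-up parameter values and hypothetical function names; it is not the fitted model from the talk.

```python
def predict_correct(p_known, p_guess, p_slip):
    """P(correct) = P(L) * (1 - P(S)) + (1 - P(L)) * P(G)."""
    return p_known * (1.0 - p_slip) + (1.0 - p_known) * p_guess


def update_known(p_known, correct, p_learn, p_guess, p_slip):
    """Bayesian update of P(L) after one response, followed by the learning step."""
    if correct:
        evidence = p_known * (1.0 - p_slip)
        posterior = evidence / (evidence + (1.0 - p_known) * p_guess)
    else:
        evidence = p_known * p_slip
        posterior = evidence / (evidence + (1.0 - p_known) * (1.0 - p_guess))
    # The student may move from Unlearned to Learned with probability P(T).
    return posterior + (1.0 - posterior) * p_learn


def trace(responses, p_init, p_learn, p_guess, p_slip):
    """Return P(correct) before each response, plus one for the next unseen response."""
    p_known = p_init
    predictions = []
    for correct in responses:
        predictions.append(predict_correct(p_known, p_guess, p_slip))
        p_known = update_known(p_known, correct, p_learn, p_guess, p_slip)
    predictions.append(predict_correct(p_known, p_guess, p_slip))
    return predictions


# Illustrative (assumed) parameter values for a single skill.
predictions = trace([0, 0, 0, 1, 1, 1], p_init=0.2, p_learn=0.1, p_guess=0.2, p_slip=0.1)
print([round(p, 3) for p in predictions])
```

The last value is the model's estimate for response n+1, the quantity knowledge tracing is asked to predict.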

Slide 22

Slide 22 text

Shallow* vs Deep
• Performance: Shallow* = Deep
• Parameters: Shallow*: 4 x # skills (pInit, pLearn, pGuess, pSlip); Deep: few hundred thousand parameters
• Interpretability
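The parameter comparison can be made concrete with a rough count. The 442 skills come from the dataset slide; the hidden size of 100, the 2 x skills one-hot input, and the single-bias-per-gate LSTM layout are assumptions chosen for illustration, not the configuration used in the talk.

```python
def bkt_parameter_count(num_skills):
    """Four parameters (pInit, pLearn, pGuess, pSlip) per skill."""
    return 4 * num_skills


def lstm_dkt_parameter_count(num_skills, hidden_size):
    """Count weights for one LSTM layer plus a per-skill output layer.

    Assumes a one-hot input of size 2 * num_skills and one bias vector per gate.
    """
    input_size = 2 * num_skills
    # Four gates, each with input weights, recurrent weights, and a bias.
    lstm = 4 * (hidden_size * (input_size + hidden_size) + hidden_size)
    output = hidden_size * num_skills + num_skills
    return lstm + output


num_skills = 442  # from the dataset slide
print("BKT parameters:", bkt_parameter_count(num_skills))
print("LSTM parameters (assumed hidden size 100):", lstm_dkt_parameter_count(num_skills, 100))
```

Under these assumptions the shallow model has under two thousand parameters, while the deep model lands in the hundreds of thousands.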

Slide 23

Slide 23 text

Deep Model: Advantages
• Intelligent Curriculum Design
• Finding best sequence of tasks
• Discovery of structure
• Instead of skill labels, question labels can be used as input
• Complex representations and features

Slide 24

Slide 24 text

Is Deep Learning really DEEP?

Slide 25

Slide 25 text

References
• Knowledge Tracing: Modelling the acquisition of procedural knowledge (Corbett et al., 1995)
• Bloom's Two Sigma Problem (1984)
• Deep Knowledge Tracing (Piech et al., 2015)
• How deep is Knowledge Tracing? (Khajah et al., 2016)
• Few hundred parameters outperform few hundred thousand? (Lalwani et al., 2017)

Slide 26

Slide 26 text

Thank You! Questions??